Abstract

These  notes  were  written  for  an  introductory  course  on  the  application  of  multigrid 
methods  to  elliptic  and  hyperbolic  partial  differential  equations  for  engineers,  physicists  and 
applied  mathematicians.  The  use  of  more  advanced  mathematical  tools,  such  as  functional 
analysis,  is  avoided.  The  course  is  intended  to  be  accessible  to  a  wide  audience  of  users 
of computational methods. We restrict ourselves to finite volume and finite difference
discretization. The basic principles are given. Smoothing methods and Fourier smoothing
analysis  are  reviewed.  The  fundamental  multigrid  algorithm  is  studied.  The  smoothing 
and  coarse  grid  approximation  properties  are  discussed.  Multigrid  schedules  and  structured 
programming  of  multigrid  algorithms  are  treated.  Robustness  and  efficiency  are  considered. 

1  Introduction


Readership 

The  purpose  of  these  notes  is  to  present,  at  graduate  level,  an  introduction  to  the  application 
of multigrid methods to elliptic and hyperbolic partial differential equations for engineers,
physicists  and  applied  mathematicians.  The  reader  is  assumed  to  be  familiar  with  the  basics 
of  the  analysis  of  partial  differential  equations  and  of  numerical  mathematics,  but  the  use 
of  more  advanced  mathematical  tools,  such  as  functional  analysis,  is  avoided.  The  course 
is  intended  to  be  accessible  to  a  wide  audience  of  users  of  computational  methods.  We  do 
not,  therefore,  delve  deeply  into  the  mathematical  foundations.  This  is  done  in  the  excellent 
monograph  by  Hackbusch  [57],  which  treats  many  aspects  of  multigrid,  and  also  contains 
many  practical  details.  The  book  [141]  is  more  accessible  to  non-mathematicians,  and  pays 
more  attention  to  applications,  especially  in  computational  fluid  dynamics. 

Other introductory material can be found in the article by Brandt [20], the first three
chapters of [85] and the short elementary introduction [27]. The notes are based on parts of [141],
where  further  details  may  be  found,  and  other  subjects  are  discussed,  notably  applications  in 
computational  fluid  dynamics. 

Significance  of  multigrid  methods  for  scientific  computation 

Needless  to  say,  elliptic  and  hyperbolic  partial  differential  equations  are,  by  and  large,  at  the 
heart  of  most  mathematical  models  used  in  engineering  and  physics,  giving  rise  to  extensive 
computations.  Often  the  problems  that  one  would  like  to  solve  exceed  the  capacity  of  even 
the  most  powerful  computers,  or  the  time  required  is  too  great  to  allow  inclusion  of  advanced 
mathematical  models  in  the  design  process  of  technical  apparatus,  from  microchips  to  aircraft, 
making design optimization more difficult. Multigrid methods are a prime source of
important advances in algorithmic efficiency, finding a rapidly increasing number of users. Unlike
other known methods, multigrid offers the possibility of solving problems with N unknowns
with O(N) work and storage, not just for special cases, but for large classes of problems.

Historical  development  of  multigrid  methods 

Table  1.0.1,  based  on  the  multigrid  bibliography  in  [85],  illustrates  the  rapid  growth  of  the 
multigrid  literature,  a  growth  which  has  continued  unabated  since  1985. 

As shown by Table 1.0.1, multigrid methods have been developed only recently. In what
probably was the first ‘true’ multigrid publication, Fedorenko [43] formulated a multigrid
algorithm for the standard five-point finite difference discretization of the Poisson equation on
a square, proving that the work required to reach a given precision is O(N). This work was
generalized to the central difference discretization of the general linear elliptic partial
differential equation (3.2.1) in Ω = (0,1) × (0,1) with variable smooth coefficients by Bachvalov [8].
The theoretical work estimates were pessimistic, and the method was not put into practice at
the  time.  The  first  practical  results  were  reported  in  a  pioneering  paper  by  Brandt  [19],  who 
published  another  paper  in  1977  [20],  clearly  outlining  the  main  principles  and  the  practical 
utility  of  multigrid  methods,  which  drew  wide  attention  and  marked  the  beginning  of  rapid 
development.  The  multigrid  method  was  discovered  independently  by  Hackbusch  [50],  who 
laid firm mathematical foundations and provided reliable methods ([52], [53], [54]). A
report by Frederickson [47] describing an efficient multigrid algorithm for the Poisson equation
led the present author to the development of a similar method for the vorticity-stream
function formulation of the Navier-Stokes equations, resulting in an efficient method ([135], [143]).

At first there was much debate and scepticism about the true merits of multigrid methods.
Satisfactory results could be obtained only after sufficient initiation. This led a number of
researchers to the development of stronger and more transparent convergence proofs ([4], [93],
[94], [51], [54], [136], [137]) (see [57] for a survey of theoretical developments). Although rate
of convergence proofs of multigrid methods are complicated, their structure has now become
more or less standardized and transparent. Other authors have tried to spread confidence in
multigrid methods by providing efficient and reliable computer programs, as much as possible
of ‘black-box’ type, for uninitiated users. A survey will be given later. The ‘multigrid
guide’  of  Brandt  ([16],  [23])  was  provided  to  give  guidelines  for  researchers  writing  their  own 
multigrid  programs. 


Year     64  66  71  72  73  75  76  77  78  79  80  81  82  83  84   85
Number    1   1   1   1   1   1   3  11  10  22  31  70  78  96  94  149

Table 1.0.1: Yearly number of multigrid publications


Scope  of  these  notes 

The  following  topics  will  not  be  treated  here:  parabolic  equations,  eigenvalue  problems  and 
integral  equations.  For  an  introduction  to  the  application  of  multigrid  methods  to  these 
subjects,  see  [56],  [57]  and  [18].  There  is  relatively  little  material  in  these  areas,  although 
multigrid  can  be  applied  profitably.  For  important  recent  advances  in  the  field  of  integral 
equations,  see  [25]  and  [130].  A  recent  publication  on  parabolic  multigrid  is  [91].  Finite 
element  methods  will  not  be  discussed,  but  finite  volume  and  finite  difference  discretization 
will  be  taken  as  the  point  of  departure.  Although  most  theoretical  work  has  been  done  in  a 
variational framework, most applications use finite volumes or finite differences. The
principles are the same, however, and the reader should have no difficulty in applying the principles
outlined in this book in a finite element context.

Multigrid principles are much more widely applicable than just to the numerical solution
of  differential  and  integral  equations.  Applications  in  such  diverse  areas  as  control  theory, 
optimization, pattern recognition, computational tomography and particle physics are
beginning to appear. For a survey of the wide ranging applicability of multigrid principles, see [17],
[18]. 

Notation 

The notation is explained as it occurs. Latin letters like $u$ denote unknown functions. The
bold version $\mathbf{u}$ denotes a grid function, with value $u_j$ in grid point $x_j$, intended as the discrete
approximation of $u(x_j)$.

2  The basic principle of multigrid methods for partial differential equations

2.1  Introduction 

In this chapter, the basic principle of multigrid for partial differential equations will be
explained by studying a one-dimensional model problem. Of course, one-dimensional problems
do  not  require  application  of  multigrid  methods,  since  for  the  algebraic  systems  that  result 
from  discretization  direct  solution  is  efficient,  but  in  one  dimension  multigrid  methods  can  be 
analysed  by  elementary  methods,  and  their  essential  principle  is  easily  demonstrated. 

Introductions  to  the  basic  principles  of  multigrid  methods  are  given  by  [20],  [27],  [28]  and 
[141].  More  advanced  expositions  are  given  by  [112],  [16]  and  [57],  Chapter  2. 

2.2  The  basic  principle 

One-dimensional  model  problem 

The  following  model  problem  will  be  considered 

$$ -\frac{d^2u}{dx^2} = f(x) \quad \text{in } \Omega = (0,1), \qquad u(0) = \frac{du(1)}{dx} = 0 \tag{2.2.1} $$

A  computational  grid  is  defined  by 

$$ G = \{x \in \mathbb{R} : x = x_j = jh,\ j = 1,2,\ldots,2n,\ h = 1/2n\} \tag{2.2.2} $$

The points $\{x_j\}$ are called the vertices of the grid.
Equation  (2.2.1)  is  discretized  with  finite  differences  as 

$$
\begin{aligned}
h^{-2}(2u_1 - u_2) &= f_1 \\
h^{-2}(-u_{j-1} + 2u_j - u_{j+1}) &= f_j, \quad j = 2,3,\ldots,2n-1 \\
h^{-2}(-u_{2n-1} + u_{2n}) &= \tfrac{1}{2} f_{2n}
\end{aligned} \tag{2.2.3}
$$

where $f_j = f(x_j)$ and $u_j$ is intended to approximate $u(x_j)$. The solution of Equation (2.2.1)
is denoted by $u$, the solution of Equation (2.2.3) by $\mathbf{u}$, and the value of $\mathbf{u}$ in $x_j$ by $u_j$. Since $u_j$
approximates the solution in the vertex $x_j$, Equation (2.2.3) is called a vertex-centered
discretization. The number of meshes in $G$ is even, to facilitate application of a two-grid
method. The system (2.2.3) is denoted by

Au  =  f  (2.2.4) 

Gauss-Seidel  iteration 

In multidimensional applications of finite difference methods, the matrix A is large and sparse,
and  the  non-zero  pattern  has  a  regular  structure.  These  circumstances  favour  the  use  of 
iterative  methods  for  solving  (2.2.4).  We  will  present  one  such  method.  Indicating  the  mth 
iterand  by  a  superscript  m,  the  Gauss-Seidel  iteration  method  for  solving  (2.2.3)  is  defined 
by,  assuming  an  initial  guess  is  given, 

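$$
\begin{aligned}
2u_1^{m+1} &= u_2^{m} + h^2 f_1 \\
-u_{j-1}^{m+1} + 2u_j^{m+1} &= u_{j+1}^{m} + h^2 f_j, \quad j = 2,3,\ldots,2n-1 \\
-u_{2n-1}^{m+1} + u_{2n}^{m+1} &= \tfrac{1}{2} h^2 f_{2n}
\end{aligned} \tag{2.2.5}
$$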
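
One sweep of the Gauss-Seidel iteration for system (2.2.3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `gauss_seidel_sweep` and the list-based storage (entry `k` holds the value at vertex $x_{k+1}$) are assumptions made here, not part of the notes.

```python
def gauss_seidel_sweep(u, f, h):
    """One Gauss-Seidel sweep for system (2.2.3).

    u and f are lists holding the values at the vertices x_1, ..., x_2n
    (list index k corresponds to vertex k + 1). A new list is returned.
    """
    N = len(u)                      # N = 2n unknowns
    new = list(u)
    # First row: 2 u_1 = u_2 + h^2 f_1 (the Dirichlet value u_0 = 0 drops out).
    new[0] = 0.5 * (new[1] + h * h * f[0])
    # Interior rows: the left neighbour is already updated, the right one is not.
    for k in range(1, N - 1):
        new[k] = 0.5 * (new[k - 1] + new[k + 1] + h * h * f[k])
    # Last row (Neumann condition): -u_{2n-1} + u_{2n} = h^2 f_{2n} / 2.
    new[N - 1] = new[N - 2] + 0.5 * h * h * f[N - 1]
    return new
```

For $f = 1$ the function $u(x) = x - x^2/2$ satisfies (2.2.3) exactly, so it is a fixed point of the sweep; this gives a simple check of an implementation.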

Fourier  analysis  of  convergence 

For  ease  of  analysis,  we  replace  the  boundary  conditions  by  periodic  boundary  conditions: 

u(l)  =  u(0)  (2.2.6) 

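Then the error $e^m = u^m - u^\infty$ is periodic and satisfies

$$ -e_{j-1}^{m+1} + 2e_j^{m+1} = e_{j+1}^{m}, \qquad e_j^m = e_{j+2n}^m \tag{2.2.7} $$

As will be discussed in more detail later, such a periodic grid function can be represented by
the following Fourier series:

$$ e_j^m = \sum_{\alpha=-n+1}^{n} c_\alpha^m e^{ij\theta_\alpha}, \qquad \theta_\alpha = \pi\alpha/n \tag{2.2.8} $$

Because of the orthogonality of $\{e^{ij\theta_\alpha}\}$ it suffices to substitute $e_j^m = c_\alpha^m e^{ij\theta_\alpha}$ in (2.2.7).

This gives $c_\alpha^{m+1} = g(\theta_\alpha)\,c_\alpha^m$, that is, $e_j^{m+1} = g(\theta_\alpha)\,e_j^m$, with

$$ g(\theta_\alpha) = e^{i\theta_\alpha}/(2 - e^{-i\theta_\alpha}) \tag{2.2.9} $$

The function $g(\theta_\alpha)$ is called the amplification factor. It measures the growth or decay of a
Fourier mode of the error during an iteration. We find

$$ |g(\theta_\alpha)| = (5 - 4\cos\theta_\alpha)^{-1/2} \tag{2.2.10} $$

At first it seems that Gauss-Seidel does not converge, because

$$ \max\{|g(\theta_\alpha)| : \theta_\alpha = \pi\alpha/n,\ \alpha = -n+1, -n+2, \ldots, n\} = |g(0)| = 1 \tag{2.2.11} $$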

However, with periodic boundary conditions the solution of (2.2.1) is determined up to a
constant only, so that there is no need to require that the Fourier mode $\theta = 0$ decays during
iteration. Equation (2.2.11), therefore, is not a correct measure of convergence, but the
following quantity is:

$$
\begin{aligned}
\max\{|g(\theta_\alpha)| : \theta_\alpha = \pi\alpha/n,\ \alpha = -n+1, -n+2, \ldots, n,\ \alpha \neq 0\} &= |g(\theta_1)| \\
= \{1 + 2\theta_1^2 + O(\theta_1^4)\}^{-1/2} &= 1 - 4\pi^2h^2 + O(h^4)
\end{aligned} \tag{2.2.12}
$$

It follows that the rate of convergence deteriorates as $h \downarrow 0$. Apart from special cases,
in the context of elliptic equations this is found to be true of all so-called basic iterative
methods (more on these later; well known examples are the Jacobi, Gauss-Seidel and successive
over-relaxation  methods)  by  which  a  grid  function  value  is  updated  using  only  neighbouring 
vertices.  This  deterioration  of  rate  of  convergence  is  found  to  occur  also  with  other  kinds  of 
boundary  conditions.  The  purpose  of  multigrid  is  to  avoid  this  deterioration,  and  to  achieve 
a  rate  of  convergence  which  is  independent  of  h. 
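
The deterioration predicted by (2.2.12) is easy to observe numerically. The following Python sketch evaluates the amplification factor (2.2.9) with complex arithmetic and takes the maximum over all non-constant modes; the function names `g` and `worst_nonconstant_mode` are chosen here for illustration only.

```python
import cmath
import math

def g(theta):
    """Gauss-Seidel amplification factor (2.2.9) for wavenumber theta."""
    return cmath.exp(1j * theta) / (2.0 - cmath.exp(-1j * theta))

def worst_nonconstant_mode(n):
    """Left-hand side of (2.2.12): max |g(theta_alpha)| over alpha != 0."""
    thetas = [math.pi * a / n for a in range(-n + 1, n + 1) if a != 0]
    return max(abs(g(t)) for t in thetas)

if __name__ == "__main__":
    for n in (8, 32, 128):
        h = 1.0 / (2 * n)
        print(n, worst_nonconstant_mode(n), 1.0 - 4.0 * math.pi ** 2 * h * h)
```

As $n$ grows, the computed maximum approaches 1 and agrees with $1 - 4\pi^2h^2$ up to higher order terms in $h$.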

The  essential  multigrid  principle 

The  rate  of  convergence  of  basic  iterative  methods  can  be  improved  with  multigrid  methods. 
The basic observation is that (2.2.10) shows that $|g(\theta_\alpha)|$ decreases as $|\alpha|$ increases. This
means that, although long wavelength Fourier modes ($\alpha$ close to 1) decay slowly ($|g(\theta_\alpha)| =
1 - O(h^2)$), short wavelength Fourier modes are reduced rapidly. The essential multigrid
principle  is  to  approximate  the  smooth  (long  wavelength)  part  of  the  error  on  coarser  grids. 
The  non-smooth  or  rough  part  is  reduced  with  a  small  number  (independent  of  h)  of  iterations 
with  a  basic  iterative  method  on  the  fine  grid. 

Fourier  smoothing  analysis 

In  order  to  be  able  to  verify  whether  a  basic  iterative  method  gives  a  good  reduction  of  the 
rough  part  of  the  error,  the  concept  of  roughness  has  to  be  defined  precisely. 

Definition 2.2.1 The set of rough wavenumbers $\Theta_r$ is defined by

$$ \Theta_r = \{\theta_\alpha = \pi\alpha/n,\ |\alpha| \geq cn,\ \alpha = -n+1, -n+2, \ldots, n\} \tag{2.2.13} $$

where $0 < c < 1$ is a fixed constant independent of $n$.

The performance of a smoothing method is measured by its smoothing factor $\rho$, defined as
follows.

Definition 2.2.2 The smoothing factor $\rho$ is defined by

$$ \rho = \max\{|g(\theta_\alpha)| : \theta_\alpha \in \Theta_r\} \tag{2.2.14} $$

When for a basic iterative method $\rho < 1$ is bounded away from 1 uniformly in $h$, we say that
the method is a smoother. Note that $\rho$ depends on the iterative method and on the problem.
For Gauss-Seidel and the present model problem $\rho$ is easily determined. Equation (2.2.10)
shows that $|g|$ decreases monotonically, so that

$$ \rho = (5 - 4\cos c\pi)^{-1/2} \tag{2.2.15} $$

Hence,  for  the  present  problem  Gauss-Seidel  is  a  smoother. 

It is convenient to standardize the choice of $c$. Only the Fourier modes that cannot be
represented on the coarse grid need to be reduced by the basic iterative method; thus it is
natural to let these modes constitute $\Theta_r$. We choose the coarse grid by doubling the mesh-size
of $G$. The Fourier modes on this grid have wavenumbers $\theta_\alpha$ given by (2.2.8) with $n$ replaced
by $n/2$ (assuming for simplicity $n$ to be even). The remaining wavenumbers are defined to
be non-smooth, and are given by (2.2.13) with

$$ c = 1/2 \tag{2.2.16} $$

Equation (2.2.15) then gives the following smoothing factor for Gauss-Seidel

$$ \rho = 5^{-1/2} \tag{2.2.17} $$
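
As a quick numerical check of Definition 2.2.2, the smoothing factor can be evaluated directly from (2.2.10) and (2.2.13). The sketch below is illustrative; the names `gauss_seidel_amplification` and `smoothing_factor` are assumptions made here.

```python
import math

def gauss_seidel_amplification(theta):
    """|g(theta)| as given by (2.2.10)."""
    return (5.0 - 4.0 * math.cos(theta)) ** -0.5

def smoothing_factor(n, c=0.5):
    """Maximum of |g| over the rough wavenumbers (2.2.13), i.e. |alpha| >= c*n."""
    rough = [math.pi * a / n for a in range(-n + 1, n + 1) if abs(a) >= c * n]
    return max(gauss_seidel_amplification(t) for t in rough)

if __name__ == "__main__":
    for n in (8, 64, 512):
        print(n, smoothing_factor(n), 5.0 ** -0.5)
```

For even $n$ and $c = 1/2$ the result does not depend on $n$, in line with the requirement that a smoother has $\rho$ bounded away from 1 uniformly in $h$.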

This  type  of  Fourier  smoothing  analysis  was  originally  introduced  by  Brandt  [20].  It  is  a 
useful  and  simple  tool.  When  the  boundary  conditions  are  not  periodic,  its  predictions  are 
found  to  remain  qualitatively  correct,  except  in  the  case  of  singular  perturbation  problems, 
to  be  discussed  later. 

With  smoothly  varying  coefficients,  experience  shows  that  a  smoother  which  performs  well 
in the ‘frozen coefficient’ case, will also perform well for variable coefficients. By the ‘frozen
coefficient’  case  we  mean  a  set  of  constant  coefficient  cases,  with  coefficient  values  equal  to  the 
values  of  the  variable  coefficients  under  consideration  in  a  sufficiently  large  sample  of  points 
in the domain.

Exercise 2.2.1 Determine the smoothing factor of the damped Jacobi method (defined later)
applied to problem (2.2.5) with boundary conditions (2.2.6). Note that with damping parameter $\omega = 1$
this  is  not  a  smoother. 

Exercise  2.2.2  Determine  the  smoothing  factor  of  the  Jacobi  method  applied  to  problem 
(2.2.5)  with  Dirichlet  boundary  conditions  u(0)  =  u(l)  =  0,  by  using  the  Fourier  sine  series. 
Note  that  the  smoothing  factor  is  the  same  as  obtained  with  the  exponential  Fourier  series. 

Exercise 2.2.3 Determine the smoothing factor of the Gauss-Seidel method for central
discretization of the convection-diffusion equation $c\,du/dx - \varepsilon\,d^2u/dx^2 = f$. Show that for
$|c|h/\varepsilon > 1$ and $c < -1$ we have no smoother.

2.3  The two-grid algorithm

Coarse grid approximation

A coarse grid $\bar{G}$ is defined by doubling the mesh-size of $G$:

$$ \bar{G} = \{x \in \mathbb{R} : x = \bar{x}_j = j\bar{h},\ j = 1,2,\ldots,n,\ \bar{h} = 1/n\} \tag{2.3.1} $$

The vertices of $\bar{G}$ also belong to $G$; thus this is called vertex-centered coarsening. The original
grid $G$ is called the fine grid. Let

$$ U : G \to \mathbb{R}, \qquad \bar{U} : \bar{G} \to \mathbb{R} \tag{2.3.2} $$

be the sets of fine and coarse grid functions, respectively. A prolongation operator $P : \bar{U} \to U$
is defined by linear interpolation:

$$ P\bar{u}_{2j} = \bar{u}_j, \qquad P\bar{u}_{2j+1} = \tfrac{1}{2}(\bar{u}_j + \bar{u}_{j+1}) \tag{2.3.3} $$

Overbars indicate coarse grid quantities. A restriction operator $R : U \to \bar{U}$ is defined by the
following weighted average

$$ Ru_j = \tfrac{1}{4}u_{2j-1} + \tfrac{1}{2}u_{2j} + \tfrac{1}{4}u_{2j+1} \tag{2.3.4} $$

where $u_j$ is defined to be zero outside $G$. Note that the matrices $P$ and $R$ are related by
$R = \tfrac{1}{2}P^T$, but this property is not essential.
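
The transfer operators (2.3.3) and (2.3.4) can be sketched in Python as follows. The sketch assumes list storage in which entry `k` holds the value at vertex `k + 1`, and it treats the coarse value at the Dirichlet point $x = 0$ as zero; the names `prolong` and `restrict` are chosen here for illustration.

```python
def prolong(ubar):
    """Linear interpolation (2.3.3): n coarse values -> 2n fine values."""
    n = len(ubar)
    u = [0.0] * (2 * n)
    for j in range(1, n + 1):                        # 1-based coarse index
        left = ubar[j - 2] if j >= 2 else 0.0        # coarse value at x = 0 is zero
        u[2 * j - 2] = 0.5 * (left + ubar[j - 1])    # fine vertex 2j - 1
        u[2 * j - 1] = ubar[j - 1]                   # fine vertex 2j
    return u

def restrict(u):
    """Weighted average (2.3.4): 2n fine values -> n coarse values."""
    N = len(u)

    def value(k):                                    # u is zero outside G
        return u[k - 1] if 1 <= k <= N else 0.0

    return [0.25 * value(2 * j - 1) + 0.5 * value(2 * j) + 0.25 * value(2 * j + 1)
            for j in range(1, N // 2 + 1)]
```

With the standard inner product, these two functions satisfy $(Ru, \bar{v}) = \tfrac{1}{2}(u, P\bar{v})$ for arbitrary grid functions, which is the relation between $R$ and the transpose of $P$ noted above.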

The fine grid equation (2.2.4) must be approximated by a coarse grid equation

$$ \bar{A}\bar{u} = \bar{f} $$

Like the fine grid matrix $A$, the coarse grid matrix $\bar{A}$ may be obtained by discretizing
Equation (2.2.1). This is called discretization coarse grid approximation. An alternative is
the following. The fine grid problem (2.2.4) is equivalent to

$$ (Au, v) = (f, v), \quad u \in U, \quad \forall v \in U \tag{2.3.5} $$

with $(\cdot,\cdot)$ the standard inner product on $U$. We want to find an approximate solution $P\bar{u}$
with $\bar{u} \in \bar{U}$. This entails restriction of the test functions $v$ to a subspace with the same
dimension as $\bar{U}$, that is, test functions of the type $\tilde{P}\bar{v}$ with $\bar{v} \in \bar{U}$, and $\tilde{P}$ a prolongation
operator that may be different from $P$:

$$ (AP\bar{u}, \tilde{P}\bar{v}) = (f, \tilde{P}\bar{v}), \quad \bar{u} \in \bar{U}, \quad \forall \bar{v} \in \bar{U} \tag{2.3.6} $$

or

$$ (\tilde{P}^*AP\bar{u}, \bar{v}) = (\tilde{P}^*f, \bar{v}), \quad \bar{u} \in \bar{U}, \quad \forall \bar{v} \in \bar{U} \tag{2.3.7} $$

where now of course $(\cdot,\cdot)$ is over $\bar{U}$, and superscript $*$ denotes the adjoint (or transpose in
this case). Equation (2.3.7) is equivalent to

$$ \bar{A}\bar{u} = \bar{f} \tag{2.3.8} $$

with

$$ \bar{A} = RAP \tag{2.3.9} $$

and $\bar{f} = Rf$; we have replaced $\tilde{P}^*$ by $R$. This choice of $\bar{A}$ is called Galerkin coarse grid
approximation.

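With $A$, $P$ and $R$ given by (2.2.3), (2.3.3) and (2.3.4), Equation (2.3.9) results in the following
$\bar{A}$:

$$
\begin{aligned}
\bar{A}\bar{u}_1 &= \bar{h}^{-2}(2\bar{u}_1 - \bar{u}_2) \\
\bar{A}\bar{u}_j &= \bar{h}^{-2}(-\bar{u}_{j-1} + 2\bar{u}_j - \bar{u}_{j+1}), \quad j = 2,3,\ldots,n-1 \\
\bar{A}\bar{u}_n &= \bar{h}^{-2}(-\bar{u}_{n-1} + \bar{u}_n)
\end{aligned} \tag{2.3.10}
$$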

which  is  the  coarse  grid  equivalent  of  the  left-hand  side  of  (2.2.3).  Hence,  in  the  present 
case  there  is  no  difference  between  Galerkin  and  discretization  coarse  grid  approximation. 
The  derivation  of  (2.3.10)  is  discussed  in  Exercise  2.3.1.  The  formula  (2.3.9)  has  theoretical 
advantages,  as  we  shall  see. 
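
The equality of Galerkin and discretization coarse grid approximation for this model problem can be checked numerically, column by column, by applying $RAP$ to coarse unit vectors. The Python sketch below is illustrative only; all function names are assumptions made here, and the operators follow (2.2.3), (2.3.3) and (2.3.4) with list index `k` standing for vertex `k + 1`.

```python
def apply_A(u, h):
    """Left-hand side of (2.2.3) (equivalently (2.3.10) on the coarse grid)."""
    N = len(u)
    out = []
    for k in range(N):
        left = u[k - 1] if k > 0 else 0.0            # Dirichlet value at x = 0
        if k < N - 1:
            out.append((-left + 2.0 * u[k] - u[k + 1]) / (h * h))
        else:
            out.append((-left + u[k]) / (h * h))     # Neumann row at x = 1
    return out

def prolong(ubar):
    """Linear interpolation (2.3.3)."""
    fine = []
    for j, value in enumerate(ubar):
        left = ubar[j - 1] if j > 0 else 0.0
        fine.append(0.5 * (left + value))            # odd fine vertex
        fine.append(value)                           # even fine vertex
    return fine

def restrict(u):
    """Weighted average (2.3.4), with u taken as zero outside the grid."""
    N = len(u)
    val = lambda k: u[k - 1] if 1 <= k <= N else 0.0
    return [0.25 * val(2 * j - 1) + 0.5 * val(2 * j) + 0.25 * val(2 * j + 1)
            for j in range(1, N // 2 + 1)]

def galerkin_column(j, n, h):
    """Column j (0-based) of RAP for fine mesh-size h."""
    e = [1.0 if i == j else 0.0 for i in range(n)]
    return restrict(apply_A(prolong(e), h))

def discretization_column(j, n, h):
    """Column j (0-based) of the coarse discretization with mesh-size 2h."""
    e = [1.0 if i == j else 0.0 for i in range(n)]
    return apply_A(e, 2.0 * h)
```

Comparing `galerkin_column(j, n, h)` with `discretization_column(j, n, h)` for all `j` reproduces (2.3.10), including the two boundary rows.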

Coarse  grid  correction 

Let $\tilde{u}$ be an approximation to the solution of (2.2.4). The error $e \equiv \tilde{u} - u$ is to be approximated
on the coarse grid. We have

$$ Ae = -r \equiv A\tilde{u} - f \tag{2.3.11} $$

The coarse grid approximation $\bar{u}$ of $-e$ satisfies

$$ \bar{A}\bar{u} = Rr \tag{2.3.12} $$

In a two-grid method it is assumed that (2.3.12) is solved exactly. The coarse grid correction
to be added to $\tilde{u}$ is $P\bar{u}$:

$$ \tilde{u} := \tilde{u} + P\bar{u} \tag{2.3.13} $$

Linear  two-grid  algorithm 

The  two-grid  algorithm  for  linear  problems  consists  of  smoothing  on  the  fine  grid,  approxima¬ 
tion  of  the  required  correction  on  the  coarse  grid,  prolongation  of  the  coarse  grid  correction 
to  the  fine  grid,  and  again  smoothing  on  the  fine  grid.  The  precise  definition  of  the  two-grid 
algorithm  is 

    comment Two-grid algorithm;
    Initialize u^0;
    for i := 1 step 1 until ntg do
        u^{1/3} := S(u^0, A, f, ν_1);
        r := f - A u^{1/3};
        ū := Ā^{-1} R r;
        u^{2/3} := u^{1/3} + P ū;
        u^1 := S(u^{2/3}, A, f, ν_2);
        u^0 := u^1;
    od                                                        (2.3.14)

The number of two-grid iterations carried out is ntg. $S(u^0, A, f, \nu_1)$ stands for $\nu_1$ smoothing
iterations, for example with the Gauss-Seidel method discussed earlier, applied to $Au = f$,
starting with $u^0$. The first application of $S$ is called pre-smoothing, the second post-smoothing.
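
Algorithm (2.3.14) can be turned into a small runnable Python sketch for the one-dimensional model problem. Several choices below are assumptions made for illustration: the right-hand side `b` is the full right-hand side vector of (2.2.3) (so its last entry is $\tfrac{1}{2}f_{2n}$), the coarse grid matrix is taken from (2.3.10), and the exact coarse solve uses tridiagonal elimination. The grid is assumed to have at least two coarse points.

```python
def apply_A(u, h):
    """Left-hand side of (2.2.3): Dirichlet at x = 0, Neumann at x = 1."""
    N = len(u)
    out = []
    for k in range(N):
        left = u[k - 1] if k > 0 else 0.0
        if k < N - 1:
            out.append((-left + 2.0 * u[k] - u[k + 1]) / (h * h))
        else:
            out.append((-left + u[k]) / (h * h))
    return out

def gauss_seidel(u, b, h):
    """One Gauss-Seidel sweep for A u = b, with A as in apply_A."""
    u = list(u)
    N = len(u)
    for k in range(N):
        left = u[k - 1] if k > 0 else 0.0
        if k < N - 1:
            u[k] = 0.5 * (left + u[k + 1] + h * h * b[k])
        else:
            u[k] = left + h * h * b[k]
    return u

def restrict(r):
    """Weighted average (2.3.4)."""
    N = len(r)
    val = lambda k: r[k - 1] if 1 <= k <= N else 0.0
    return [0.25 * val(2 * j - 1) + 0.5 * val(2 * j) + 0.25 * val(2 * j + 1)
            for j in range(1, N // 2 + 1)]

def prolong(c):
    """Linear interpolation (2.3.3)."""
    fine = []
    for j, value in enumerate(c):
        left = c[j - 1] if j > 0 else 0.0
        fine.extend([0.5 * (left + value), value])
    return fine

def solve_coarse(rbar, hbar):
    """Exact solve of the coarse system (2.3.10) by tridiagonal elimination."""
    n = len(rbar)
    diag = [2.0] * (n - 1) + [1.0]
    rhs = [hbar * hbar * r for r in rbar]
    for j in range(1, n):                 # off-diagonal entries are all -1
        diag[j] -= 1.0 / diag[j - 1]
        rhs[j] += rhs[j - 1] / diag[j - 1]
    x = [0.0] * n
    x[n - 1] = rhs[n - 1] / diag[n - 1]
    for j in range(n - 2, -1, -1):
        x[j] = (rhs[j] + x[j + 1]) / diag[j]
    return x

def two_grid(u, b, h, ntg, nu1=0, nu2=1):
    """ntg iterations of the two-grid algorithm (2.3.14)."""
    for _ in range(ntg):
        for _ in range(nu1):                              # pre-smoothing
            u = gauss_seidel(u, b, h)
        r = [bi - ai for bi, ai in zip(b, apply_A(u, h))]
        ubar = solve_coarse(restrict(r), 2.0 * h)         # coarse grid correction
        u = [ui + ci for ui, ci in zip(u, prolong(ubar))]
        for _ in range(nu2):                              # post-smoothing
            u = gauss_seidel(u, b, h)
    return u
```

The default `nu1=0, nu2=1` corresponds to the case analysed in Section 2.4, so the residual reduction per iteration can be compared directly with the bound derived there.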

Exercise 2.3.1 Derive (2.3.10). (Hint: it is easy to write down $RAu_i$ in the interior and at
the boundaries. Next, one replaces $u_i$ by $P\bar{u}_i$.)

2.4  Two-grid  analysis 

The purpose of two-grid analysis (as of multigrid analysis) is to show that the rate of
convergence is independent of the mesh-size $h$. We will analyse algorithm (2.3.14) for the special
case $\nu_1 = 0$ (no pre-smoothing).

Coarse  grid  correction 

From (2.3.14) it follows that after coarse grid correction the error $e^{2/3} = u^{2/3} - u$ satisfies

$$ e^{2/3} = e^{1/3} + P\bar{u} = Ee^{1/3} \tag{2.4.1} $$

with the iteration matrix or error amplification matrix $E$ defined by

$$ E = I - P\bar{A}^{-1}RA \tag{2.4.2} $$

We will express $e^{2/3}$ explicitly in terms of $e^{1/3}$. This is possible only in the present simple
one-dimensional case, which is our main motivation for studying this case. Let

$$ e^{1/3} = d + P\bar{e}, \quad \text{with } \bar{e}_j = e^{1/3}_{2j} \tag{2.4.3} $$

Then, since $EP = P - P\bar{A}^{-1}RAP = 0$ by (2.3.9), it follows that

$$ e^{2/3} = Ee^{1/3} = Ed \tag{2.4.4} $$

We find from (2.4.3) that

$$ d_{2j} = 0, \qquad d_{2j+1} = -\tfrac{1}{2}e^{1/3}_{2j} + e^{1/3}_{2j+1} - \tfrac{1}{2}e^{1/3}_{2j+2} \tag{2.4.5} $$

Furthermore, from (2.4.5) it follows that

$$ RAd = 0 \tag{2.4.6} $$

so that

$$ e^{2/3} = d \tag{2.4.7} $$

Smoothing 

Next, we consider the effect of post-smoothing by one Gauss-Seidel iteration. From (2.2.5) it
follows that the error after post-smoothing $e^1 = u^1 - u$ is related to $e^{2/3}$ by

$$
\begin{aligned}
2e^1_1 &= e^{2/3}_2 \\
-e^1_{j-1} + 2e^1_j &= e^{2/3}_{j+1}, \quad j = 2,3,\ldots,2n-1 \\
-e^1_{2n-1} + e^1_{2n} &= 0
\end{aligned} \tag{2.4.8}
$$

Using (2.4.5)-(2.4.7) this can be rewritten as

$$
\begin{aligned}
e^1_1 &= 0 \\
e^1_{2j} &= \tfrac{1}{2}d_{2j+1} + \tfrac{1}{2}e^1_{2j-1}, \quad e^1_{2j+1} = \tfrac{1}{2}e^1_{2j}, \quad j = 1,2,\ldots,n-1 \\
e^1_{2n} &= e^1_{2n-1}
\end{aligned} \tag{2.4.9}
$$

By induction it is easy to see that

$$ |e^1_j| \leq \tfrac{2}{3}\|d\|_\infty, \qquad \|d\|_\infty = \max\{|d_j| : j = 1,2,\ldots,2n\} \tag{2.4.10} $$

Since $d = e^{2/3}$ we see that Gauss-Seidel reduces the maximum norm of the error by a factor
2/3 or less.

Rate  of  convergence 

Since $e^{1/3}_0 = 0$ because of the boundary conditions, it follows from (2.4.5) that

$$ \|d\|_\infty \leq \tfrac{1}{2}h^2\|Ae^0\|_\infty \tag{2.4.11} $$

since $e^{1/3} = e^0$ (no pre-smoothing).
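From (2.4.9) it follows that

$$ Ae^1_{2j} = h^{-2}\left(\tfrac{3}{4}d_{2j+1} - \tfrac{1}{4}e^1_{2j-1}\right), \qquad Ae^1_{2j+1} = -h^{-2}e^1_{2j+2}, \qquad Ae^1_{2n} = 0 \tag{2.4.12} $$

Hence, using (2.4.10) together with $|e^1_{2j+1}| = \tfrac{1}{2}|e^1_{2j}| \leq \tfrac{1}{3}\|d\|_\infty$ from (2.4.9),

$$ |Ae^1_j| \leq \tfrac{5}{6}h^{-2}\|d\|_\infty \tag{2.4.13} $$

Substitution of (2.4.11) gives

$$ \|r^1\|_\infty \leq \tfrac{5}{12}\|r^0\|_\infty \tag{2.4.14} $$

where $r = Ae$ is the residual. This shows that the maximum norm of the residual is reduced by a factor of
5/12 or better, independent of the mesh-size.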


This  type  of  analysis  is  restricted  to  the  particular  case  at  hand.  More  general  cases  will  be 
treated  later  by  Fourier-analysis.  There  a  drawback  is  of  course  the  assumption  of  periodic 
boundary  conditions.  The  general  proofs  of  rate  of  convergence  referred  to  in  the  introduction 
do  not  give  sharp  estimates.  Therefore  the  sharper  estimates  obtainable  by  Fourier  analysis 
are  more  useful  for  debugging  codes.  On  the  sharpness  of  rate  of  convergence  predictions 
based  on  Fourier  analysis,  see  [24]. 

Again:  the  essential  principle 

How is the essential principle of multigrid, discussed in Section 2.2, recognized in the foregoing
analysis? Equations (2.4.6) and (2.4.7) show that

$$ RAe^{2/3} = 0 \tag{2.4.15} $$

Application of $R$ means taking a local average with positive weights; thus (2.4.15) implies
that $Ae^{2/3}$ has many sign changes, and is therefore rough. Since $Ae^{2/3} = Au^{2/3} - f$ is
the residual, we see that after coarse grid correction the residual is rough. The smoother
is efficient in reducing this non-smooth residual further, which explains the $h$-independent
reduction shown in (2.4.14).

Exercise 2.4.1 In the definition of $G$ (2.2.2) and $\bar{G}$ (2.3.1) we have not included the point
$x = 0$, where a Dirichlet condition holds. If a Neumann condition is given at $x = 0$, the point
$x = 0$ must be included in $G$ and $\bar{G}$. If one wants to write a general multigrid program for
both cases, $x = 0$ has to be included. Repeat the foregoing analysis of the two-grid algorithm
with $x = 0$ included in $G$ and $\bar{G}$. Note that including $x = 0$ makes $A$ non-symmetric. This
difficulty does not occur with cell-centered discretization, to be discussed in the next chapter.


3  Basic  Iterative  Methods 

3.1  Introduction 

Smoothing  methods  in  multigrid  algorithms  are  usually  taken  from  the  class  of  basic  iterative 
methods,  to  be  defined  below.  This  chapter  presents  an  introduction  to  these  methods.  A 
more  detailed  account  may  be  found  in  [141]. 


Basic  iterative  methods 

Suppose that discretization of the partial differential equation to be solved leads to the
following linear algebraic system

$$ Ay = b \tag{3.1.1} $$

Let the matrix $A$ be split as

$$ A = M - N \tag{3.1.2} $$

with $M$ non-singular. Then the following iteration method for the solution of (3.1.1) is called
a basic iterative method:

$$ My^{m+1} = Ny^m + b \tag{3.1.3} $$

or

$$ y^{m+1} = Sy^m + Tb \tag{3.1.4} $$

with

$$ S = M^{-1}N, \qquad T = M^{-1} \tag{3.1.5} $$

so that we have

$$ y^{m+1} = Sy^m + M^{-1}b, \qquad S = M^{-1}N, \qquad N = M - A \tag{3.1.6} $$

The matrix $S$ is called the iteration matrix of iteration method (3.1.6).

Basic iterative methods may be damped, by modifying (3.1.6) as follows

$$
\begin{aligned}
y^* &= Sy^m + M^{-1}b \\
y^{m+1} &= \omega y^* + (1 - \omega)y^m
\end{aligned} \tag{3.1.7}
$$

By elimination of $y^*$ one obtains

$$ y^{m+1} = S^*y^m + \omega M^{-1}b \tag{3.1.8} $$

with

$$ S^* = \omega S + (1 - \omega)I \tag{3.1.9} $$

The eigenvalues of the undamped iteration matrix $S$ and the damped iteration matrix $S^*$ are
related by

$$ \lambda(S^*) = \omega\lambda(S) + 1 - \omega \tag{3.1.10} $$
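
The two-stage form (3.1.7) translates directly into code. The following Python sketch uses point Jacobi ($M = \mathrm{diag}(A)$) as the underlying method and dense lists of lists for $A$; both choices, and the function names, are assumptions made here for illustration.

```python
def jacobi_step(A, y, b):
    """Undamped point Jacobi step: M = diag(A), y_new = M^{-1}(N y + b)."""
    n = len(y)
    return [(b[i] - sum(A[i][j] * y[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]

def damped_step(A, y, b, omega):
    """Damped step (3.1.7): y* from the undamped method, then averaging."""
    y_star = jacobi_step(A, y, b)
    return [omega * s + (1.0 - omega) * v for s, v in zip(y_star, y)]

if __name__ == "__main__":
    A = [[2.0, -1.0], [-1.0, 2.0]]
    zero = [0.0, 0.0]
    # (1, 1) and (1, -1) are eigenvectors of S with eigenvalues 1/2 and -1/2.
    print(damped_step(A, [1.0, 1.0], zero, 0.5))
    print(damped_step(A, [1.0, -1.0], zero, 0.5))
```

With $b = 0$ a step applies $S^*$ to the vector, so for the small example in the `__main__` part the two eigenvectors of $S$ are scaled by $\omega\lambda + 1 - \omega$, as stated in (3.1.10).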

Although the possibility that a divergent method (3.1.6) or (3.1.8) is a good smoother (a
concept to be explained later) cannot be excluded, the most likely candidates for good smoothing
methods are to be found among convergent methods. In the next section, therefore, some
results on convergence of basic iterative methods are presented. For more background, see
[129]  and  [151]. 

Exercise 3.1.1 Show that (3.1.8) corresponds to the splitting

$$ M^* = M/\omega, \qquad N^* = M^* - A \tag{3.1.11} $$

3.2  Convergence  of  basic  iterative  methods 

Convergence 

In the convergence theory for (3.1.3) the following concepts play an important role. We have
$My = Ny + b$, so that the error $e^m = y^m - y$ satisfies

$$ e^{m+1} = Se^m \tag{3.2.1} $$

As shown in many textbooks, we have

Theorem 3.2.2 Convergence of (3.1.3) is equivalent to

$$ \rho(S) < 1 \tag{3.2.2} $$

with $\rho(S)$ the spectral radius of $S$.

Regular splittings and M- and K-matrices

Definition 3.2.2 The splitting (3.1.2) is called regular if $M^{-1} \geq 0$ and $N \geq 0$ (elementwise).
The splitting is convergent when (3.1.3) converges.

Definition 3.2.3 ([129], Definition 3.3). The matrix $A$ is called an M-matrix if $a_{ij} \leq 0$ for
all $i,j$ with $i \neq j$, $A$ is non-singular and $A^{-1} \geq 0$ (elementwise).

Theorem 3.2.3 A regular splitting of an M-matrix is convergent.

Proof. See [129] Theorem 3.13.

Unfortunately, a regular splitting of an M-matrix does not necessarily give a smoothing method.
A counterexample is the Jacobi method (to be discussed shortly) applied to Laplace’s
equation (see later). In practice, however, it is easy to find good smoothing methods if $A$ is an
M-matrix. As discussed in [145], a convergent iterative method can always be turned into a
smoothing method by introduction of damping. We will find later that the efficacy of
smoothing methods can often be enhanced significantly by damping. Damped versions of the methods
to be discussed are obtained easily, using equations (3.1.8), (3.1.9) and (3.1.10).

Hence, it is worthwhile to try to discretize in such a way that the resulting matrix $A$ is
an  M-matrix.  In  order  to  make  it  easy  to  see  if  a  discretization  matrix  is  an  M-matrix  we 
present  the  following  theorem. 

Definition 3.2.4 A matrix $A$ is called irreducible if from (3.1.1) one cannot extract a
subsystem that can be solved independently.

Definition 3.2.5 A matrix $A$ is called a K-matrix if

$$ a_{ii} > 0, \quad \forall i, \tag{3.2.3} $$

$$ a_{ij} \leq 0, \quad \forall i,j \text{ with } i \neq j \tag{3.2.4} $$

and

$$ \sum_j a_{ij} \geq 0, \quad \forall i, \tag{3.2.5} $$

with strict inequality for at least one $i$.

Theorem 3.2.4 An irreducible K-matrix is an M-matrix.

Proof. See [141].

Note that inspection of the K-matrix property is easy.
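
To illustrate how simple this inspection is, the conditions (3.2.3)-(3.2.5) can be tested mechanically. The Python sketch below works on a dense list-of-lists matrix; the function names are chosen here, and the irreducibility test uses the standard graph characterization (the directed graph of the non-zero pattern is strongly connected), which is an assumption beyond the informal Definition 3.2.4.

```python
def is_k_matrix(A):
    """Check the K-matrix conditions (3.2.3)-(3.2.5) for a dense square matrix."""
    n = len(A)
    positive_diagonal = all(A[i][i] > 0 for i in range(n))
    nonpositive_offdiagonal = all(A[i][j] <= 0
                                  for i in range(n) for j in range(n) if i != j)
    row_sums = [sum(row) for row in A]
    return (positive_diagonal and nonpositive_offdiagonal
            and all(s >= 0 for s in row_sums) and any(s > 0 for s in row_sums))

def is_irreducible(A):
    """True if the directed graph of the non-zero pattern is strongly connected."""
    n = len(A)

    def reaches_all(connected):
        seen = {0}
        stack = [0]
        while stack:
            i = stack.pop()
            for j in range(n):
                if j not in seen and connected(i, j):
                    seen.add(j)
                    stack.append(j)
        return len(seen) == n

    # Strong connectivity: every node is reachable from node 0 and vice versa.
    return (reaches_all(lambda i, j: A[i][j] != 0)
            and reaches_all(lambda i, j: A[j][i] != 0))
```

In floating point arithmetic the row sums of a discretization matrix may be zero only up to rounding, so in practice a small tolerance in the row-sum test may be needed.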

The  following  theorem  is  helpful  in  the  construction  of  regular  splittings. 

Theorem 3.2.5 Let $A$ be an M-matrix. If $M$ is obtained by replacing certain elements $a_{ij}$
with $i \neq j$ by values $b_{ij}$ satisfying $a_{ij} \leq b_{ij} \leq 0$, then $A = M - N$ is a regular splitting.

Proof. This theorem is an easy generalization of Theorem 3.14 in [129] suggested by Theorem
2.2 in [87]. □


The  basic  iterative  methods  to  be  considered  all  result  in  regular  splittings,  and  lead  to 
numerically  stable  algorithms,  if  A  is  an  M-matrix.  This  is  one  reason  why  it  is  advisable 
to  discretize  the  partial  differential  equation  to  be  solved  in  such  a  way  that  the  resulting 
matrix is an M-matrix. This may require upwind differencing for first derivatives. Another
reason is the exclusion of numerical wiggles in the computed solution, because a monotonicity
principle is associated with the M-matrix property.

3.3  Examples  of  basic  iterative  methods:  Jacobi  and  Gauss-Seidel 

We present a number of (mostly) common basic iterative methods by defining the
corresponding splittings (3.1.2). We assume that $A$ arises from a discretization on a two-dimensional
structured  grid. 

Point  Jacobi.  M  =  diag  (A). 

Block Jacobi. M is obtained from A by replacing $a_{ij}$ for all i, j with $j \ne i, i \pm 1$ by zero. With the forward ordering of the grid points of Figure 3.3.1 this gives horizontal line Jacobi; with the forward vertical line ordering of Figure 3.3.2 one obtains vertical line Jacobi. One horizontal line Jacobi iteration followed by one vertical line Jacobi iteration gives alternating Jacobi.

Figure 3.3.1: Grid point orderings for point Gauss-Seidel.


Point Gauss-Seidel. M is obtained from A by replacing $a_{ij}$ for all i, j with j > i by zero.

Figure 3.3.2: Grid point orderings for block Gauss-Seidel.

Block Gauss-Seidel. M is obtained from A by replacing $a_{ij}$ for all i, j with j > i + 1 by zero.

From Theorem 3.2.5 it is immediately clear that, if A is an M-matrix, then the Jacobi and Gauss-Seidel methods correspond to regular splittings.
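As an illustration of the point Jacobi and point Gauss-Seidel splittings, the following sketch performs single sweeps on a small dense system. It assumes that the basic iteration has the form $My^{m+1} = Ny^m + b$ with A = M - N; the function names and the test matrix tridiag(-1, 2, -1) are choices made here for the example only.

```python
def jacobi_sweep(a, b, y):
    """One point Jacobi iteration, M = diag(A): all updates use the old iterate."""
    n = len(b)
    return [(b[i] - sum(a[i][j] * y[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)]


def gauss_seidel_sweep(a, b, y):
    """One forward point Gauss-Seidel iteration, M = lower triangular part of A."""
    n = len(b)
    y = list(y)
    for i in range(n):
        # entries j < i have already been updated, entries j > i are still old
        y[i] = (b[i] - sum(a[i][j] * y[j] for j in range(n) if j != i)) / a[i][i]
    return y


a = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
b = [0.0, 0.0, 0.0, 5.0]      # chosen so that the exact solution is (1, 2, 3, 4)

y_jac = [0.0] * 4
y_gs = [0.0] * 4
for _ in range(400):
    y_jac = jacobi_sweep(a, b, y_jac)
    y_gs = gauss_seidel_sweep(a, b, y_gs)
```

The only difference between the two sweeps is whether updated values are used immediately, which is exactly the difference between M = diag(A) and M equal to the lower triangular part of A.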

Gauss-Seidel variants

It turns out that in many applications the efficiency of Gauss-Seidel methods depends strongly on the ordering of equations and unknowns. Also, the possibilities of vectorized and parallel computing depend strongly on this ordering. We now, therefore, discuss some possible
orderings.  The  equations  and  unknowns  are  associated  in  a  natural  way  with  points  in  a 
computational  grid.  It  suffices,  therefore,  to  discuss  orderings  of  computational  grid  points. 
We  restrict  ourselves  to  a  two-dimensional  grid  G,  which  is  enough  to  illustrate  the  basic 
ideas.  G  is  defined  by 

$G = \{(i,j) : i = 1, 2, \ldots, I; \; j = 1, 2, \ldots, J\}$  (3.3.1)

The  points  of  G  represent  either  vertices  or  cell  centres,  depending  on  the  discretization 
method. 

Forward  or  lexicographic  ordering 

The  grid  points  are  numbered  as  follows 


$k = i + (j-1)I$  (3.3.2)

Backward  ordering 

This  ordering  corresponds  to  the  enumeration 

$k = IJ + 1 - i - (j-1)I$  (3.3.3)

White-black ordering

This  ordering  corresponds  to  a  chessboard  colouring  of  G,  numbering  first  the  black  points 
and  then  the  white  points,  or  vice  versa;  cf.  Figure  3.3.1. 
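The forward, backward and white-black orderings can be written down as small index maps. This is a sketch only: it assumes the forward enumeration k = i + (j-1)I and its reversal for the backward ordering, and it arbitrarily takes the points with i + j even as the colour that is numbered first; the function names are hypothetical.

```python
def forward_index(i, j, I):
    """Forward (lexicographic) number of grid point (i, j), 1 <= i <= I."""
    return i + (j - 1) * I


def backward_index(i, j, I, J):
    """Backward ordering: the forward ordering traversed in reverse."""
    return I * J + 1 - i - (j - 1) * I


def white_black_order(I, J):
    """Map (i, j) -> k, numbering one colour of the chessboard first.

    Assumption made for this sketch: points with i + j even come first,
    and within each colour the forward ordering is kept.
    """
    points = [(i, j) for j in range(1, J + 1) for i in range(1, I + 1)]
    first = [p for p in points if (p[0] + p[1]) % 2 == 0]
    second = [p for p in points if (p[0] + p[1]) % 2 == 1]
    return {p: k for k, p in enumerate(first + second, start=1)}


wb = white_black_order(4, 3)
```

For a five-point stencil, every neighbour of a point has the opposite colour, which is why all points of one colour can be updated independently.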

Diagonal  ordering 

The points are numbered per diagonal, starting in a corner; see Figure 3.3.1. Different variants are obtained by starting in different corners. If the matrix A corresponds to a discrete operator with a stencil as in Figure 3.3.3(b), then point Gauss-Seidel with the diagonal ordering of Figure 3.3.1 is mathematically equivalent to forward Gauss-Seidel.


Figure 3.3.3: Discretization stencils (a), (b) and (c).


Point Gauss-Seidel-Jacobi

We propose this variant in order to facilitate vectorized and parallel computing; more on this shortly. M is obtained from A by replacing $a_{ij}$ by zero except $a_{ii}$ and $a_{i,i-1}$. We call this point Gauss-Seidel-Jacobi because this is a compromise between the point Gauss-Seidel and Jacobi methods discussed above. Four different methods are obtained with the following four orderings: the forward and backward orderings of Figure 3.3.1, the forward vertical line ordering of Figure 3.3.2, and this last ordering reversed. Applying these methods in succession results in four-direction point Gauss-Seidel-Jacobi.

White-black  line  Gauss-Seidel 

This  can  be  seen  as  a  mixture  of  lexicographic  and  white-black  ordering.  The  concept  is  best 
illustrated  with  a  few  examples.  With  horizontal  forward  white-black  Gauss-Seidel  the  grid 
points  are  visited  horizontal  line  by  horizontal  line  in  order  of  increasing  j  (forward),  while 
per  line  the  grid  points  are  numbered  in  white-black  order,  cf.  Figure  3.3.1.  The  lines  can  also 
be taken in order of decreasing j, resulting in horizontal backward white-black Gauss-Seidel. Doing one after the other gives horizontal symmetric white-black Gauss-Seidel. The lines can also be taken vertically; Figure 3.3.1 illustrates vertical backward white-black Gauss-Seidel. Combining horizontal and vertical symmetric white-black Gauss-Seidel gives alternating white-black Gauss-Seidel. White-black line Gauss-Seidel ordering has been proposed in [128].

Orderings  for  block  Gauss-Seidel 

With block Gauss-Seidel, the unknowns corresponding to lines in the grid are updated simultaneously. Forward and backward horizontal line Gauss-Seidel correspond to the forward and
backward  ordering,  respectively,  in  Figure  3.3.1.  Figure  3.3.2  gives  some  more  orderings  for 
block  Gauss-Seidel. 

Symmetric  horizontal  line  Gauss-Seidel  is  forward  horizontal  line  Gauss-Seidel  followed 
by  backward  horizontal  line  Gauss-Seidel,  or  vice  versa. Alternating  zebra  Gauss-Seidel  is 
horizontal  zebra  followed  by  vertical  zebra  Gauss-Seidel,  or  vice  versa.  Other  combinations 
come  to  mind  easily. 

Vectorized  and  parallel  computing 

The basic iterative methods discussed above differ in their suitability for computing with vector or parallel machines. Since the updated quantities are mutually independent, Jacobi parallelizes and vectorizes completely, with vector length I × J. If the structure of the stencil [A] is as in Figure 3.3.3(c), then with zebra Gauss-Seidel the updated blocks are mutually independent, and can be handled simultaneously on a vector or a parallel machine. The same is true for point Gauss-Seidel if one chooses a suitable four-colour ordering scheme. The vector length for horizontal or vertical zebra Gauss-Seidel is J or I, respectively. The white and black groups in white-black Gauss-Seidel are mutually independent if the structure of [A] is given by Figure 3.3.4. The vector length is I × J/2. With diagonal Gauss-Seidel, the points inside a diagonal are mutually independent if the structure of [A] is given by Figure 3.3.3(b) and the diagonals are chosen as in Figure 3.3.1. The same is true when [A] has the structure given in Figure 3.3.3(a), if the diagonals are rotated by 90°. The average vector length is roughly I/2 or J/2, depending on the length of the largest diagonal in the grid. With Gauss-Seidel-Jacobi, lines in the grid can be handled in parallel; for example, with the forward ordering of Figure 3.3.1 the points on vertical lines can be updated simultaneously. With white-black line Gauss-Seidel, points of the same colour can be updated simultaneously, resulting in a vector length of I/2 or J/2, as the case may be.

Exercise 3.3.1 Let A = L + D + U, with $l_{ij} = 0$ for $j \ge i$, D = diag(A), and $u_{ij} = 0$ for $j \le i$. Show that the iteration matrix of symmetric point Gauss-Seidel is given by

$S = (U + D)^{-1} L (L + D)^{-1} U$  (3.3.4)

Exercise  3.3.2  Prove  Theorem  3.3.1. 

Figure 3.3.4: Five-point stencil.

3.4  Examples of basic iterative methods: incomplete point LU factorization

Complete  LU  factorization 

When  solving  Ay  =  b  directly,  a  factorization  A  =  LU  is  constructed,  with  L  and  U  a  lower 
and  an  upper  triangular  matrix.  This  we  call  complete  factorization.  When  A  represents  a 
discrete  operator  with  stencil  structure,  for  example,  as  in  Figure  3.3.3,  then  L  and  U  turn 
out to be much less sparse than A, which renders this method inefficient for the class of
problems  under  consideration. 

Incomplete  point  factorization 

With  incomplete  factorization  or  incomplete  LU  factorization  (ILU)  one  generates  a  splitting 
A  =  M  -  N  with  M  having  sparse  and  easy  to  compute  lower  and  upper  triangular  factors 
L  and  U: 

$M = LU$  (3.4.1)

If A is symmetric one chooses a symmetric factorization:

$M = LL^T$  (3.4.2)

An alternative factorization of M is

$M = LD^{-1}U$  (3.4.3)

With incomplete point factorization, D is chosen to be a diagonal matrix, and diag(L) = diag(U) = D, so that (3.4.3) and (3.4.1) are equivalent. L, D and U are determined as follows. A graph $\mathcal{G}$ of the incomplete decomposition is defined, consisting of two-tuples (i, j) for which the elements $l_{ij}$, $d_{ii}$ and $u_{ij}$ are allowed to be non-zero. Then L, D and U are defined by

$(LD^{-1}U)_{kl} = a_{kl}, \quad \forall (k,l) \in \mathcal{G}$  (3.4.4)

We will discuss a few variants of ILU factorization. These result in a splitting A = M - N with $M = LD^{-1}U$. Modified incomplete point factorization is obtained if D as defined by (3.4.4) is changed to $D + \sigma\tilde{D}$, with $\sigma \in \mathbb{R}$ a parameter, and $\tilde{D}$ a diagonal matrix defined by $\tilde{d}_{kk} = \sum_{l \ne k} |n_{kl}|$. From now on the modified version will be discussed, since the unmodified version follows as a special case. This or similar modifications have been investigated in the context of multigrid methods in [65], [97], [83], [82] and [145], [147]. A survey is given in [142]. We will discuss a few variants of modified ILU factorization.

Five-point  ILU 

Let  the  grid  be  given  by  (3.3.1),  let  the  grid  points  be  ordered  according  to  (3.3.2),  and  let 
the  structure  of  the  stencil  be  given  by  Figure  3.3.3.  Then  the  graph  of  A  is 

$\mathcal{G} = \{(k, k-I), (k, k-1), (k, k), (k, k+1), (k, k+I)\}$  (3.4.5)

For brevity the following notation is introduced

$a_k = a_{k,k-I}, \quad c_k = a_{k,k-1}, \quad d_k = a_{kk}, \quad q_k = a_{k,k+1}, \quad g_k = a_{k,k+I}$  (3.4.6)

Let the graph of the incomplete factorization be given by (3.4.5), and let the non-zero elements of L, D and U be called $\alpha_k, \gamma_k, \delta_k, \mu_k$ and $\eta_k$; the locations of these elements are identical to those of $a_k, \ldots, g_k$, respectively. Because the graph contains five elements, the resulting method is called five-point ILU. Let $\alpha, \ldots, \eta$ be the IJ × IJ matrices with elements $\alpha_k, \ldots, \eta_k$, respectively, and similarly for $a, \ldots, g$. Then one can write

$LD^{-1}U = \alpha + \gamma + \delta + \mu + \eta + \alpha\delta^{-1}\mu + \alpha\delta^{-1}\eta + \gamma\delta^{-1}\mu + \gamma\delta^{-1}\eta$  (3.4.7)

From (3.4.4) it follows

$\alpha = a, \quad \gamma = c, \quad \mu = q, \quad \eta = g$  (3.4.8)

and, introducing modification as described above,

$\delta + a\delta^{-1}g + c\delta^{-1}q = d + \sigma\tilde{D}$  (3.4.9)

The rest matrix N is given by

$N = a\delta^{-1}q + c\delta^{-1}g + \sigma\tilde{D}$  (3.4.10)

The only non-zero entries of N are

$n_{k,k-I+1} = a_k \delta_{k-I}^{-1} q_{k-I}, \quad n_{k,k+I-1} = c_k \delta_{k-1}^{-1} g_{k-1},$

$n_{kk} = \sigma(|n_{k,k-I+1}| + |n_{k,k+I-1}|)$  (3.4.11)

Here and in the following, elements in which indices outside the range [1, IJ] occur are to be replaced by zero. From (3.4.9) the following recursion is obtained:

$\delta_k = d_k - a_k \delta_{k-I}^{-1} g_{k-I} - c_k \delta_{k-1}^{-1} q_{k-1} + n_{kk}$  (3.4.12)

This factorization has been studied in [41].

From (3.4.12) it follows that δ can overwrite d, so that the only additional storage required is for N. When required, the residual $b - Ay^{m+1}$ can be computed as follows without using A:

$b - Ay^{m+1} = N(y^{m+1} - y^m)$  (3.4.13)

which follows easily from (3.1.3). Since N is usually more sparse than A, (3.4.13) is a cheap way to compute the residual. For all methods of type (3.1.3) one needs to store only M and N, and A can be overwritten.
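The five-point ILU recursion can be sketched as follows. The code assumes the recursion in the form $\delta_k = d_k - a_k \delta_{k-I}^{-1} g_{k-I} - c_k \delta_{k-1}^{-1} q_{k-1} + n_{kk}$, with $n_{kk}$ equal to σ times the sum of the absolute values of the two off-diagonal entries of N in row k, and with terms whose indices fall outside the grid dropped. The function names and the Poisson test stencil are hypothetical choices for this illustration; stencil coefficients pointing outside the grid are assumed to be stored as zero.

```python
def five_point_ilu(a, c, d, q, g, I, sigma=0.0):
    """Return the diagonal delta of the (modified) five-point ILU factors.

    a, c, d, q, g hold the stencil coefficients a_k, c_k, d_k, q_k, g_k
    for k = 1..IJ at list positions 0..IJ-1.
    """
    n = len(d)
    delta = [0.0] * n
    for k in range(n):
        value = d[k]
        n_kk = 0.0
        if k - I >= 0:
            value -= a[k] * g[k - I] / delta[k - I]
            n_kk += abs(a[k] * q[k - I] / delta[k - I])   # |n_{k,k-I+1}|
        if k - 1 >= 0:
            value -= c[k] * q[k - 1] / delta[k - 1]
            n_kk += abs(c[k] * g[k - 1] / delta[k - 1])   # |n_{k,k+I-1}|
        delta[k] = value + sigma * n_kk
    return delta


def poisson_five_point(I, J):
    """Five-point Laplacian stencil arrays on an I x J grid (example data only)."""
    n = I * J
    a = [-1.0 if k // I > 0 else 0.0 for k in range(n)]        # coupling to k - I
    c = [-1.0 if k % I > 0 else 0.0 for k in range(n)]         # coupling to k - 1
    q = [-1.0 if k % I < I - 1 else 0.0 for k in range(n)]     # coupling to k + 1
    g = [-1.0 if k // I < J - 1 else 0.0 for k in range(n)]    # coupling to k + I
    d = [4.0] * n
    return a, c, d, q, g


a, c, d, q, g = poisson_five_point(3, 3)
delta = five_point_ilu(a, c, d, q, g, I=3)
```

Since only the diagonal changes, the result can overwrite the storage of d, as remarked above.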

Seven-point  ILU 

The terminology seven-point ILU indicates that the graph of the incomplete factorization has seven elements. The graph $\mathcal{G}$ is chosen as follows:

$\mathcal{G} = \{(k, k \pm I), (k, k \pm I \mp 1), (k, k \pm 1), (k, k)\}$  (3.4.14)

For the computation of L, D and U see [141]. L, D and U can overwrite A. The only additional storage required is for N. Or, if one prefers, elements of N can be computed when needed.

Nine-point  ILU 

The principles are the same as for five- and seven-point ILU. Now the graph $\mathcal{G}$ has nine elements, chosen as follows

$\mathcal{G} = \mathcal{G}_1 \cup \{(k, k \pm I \pm 1)\}$  (3.4.15)

with $\mathcal{G}_1$ given by (3.4.14).

For the computation of L, D and U see [141].

Alternating  ILU 

Alternating  ILU  consists  of  one  ILU  iteration  of  the  type  just  discussed  or  similar,  followed 
by  a  second  ILU  iteration  based  on  a  different  ordering  of  the  grid  points.  As  an  example,  let 
the  grid  be  defined  by  (3.3.1),  and  let  the  grid  points  be  numbered  according  to 

$k = IJ + 1 - j - (i-1)J$  (3.4.16)

This ordering is illustrated in Figure 3.4.1, and will be called here the second backward ordering, to distinguish it from the backward ordering defined by (3.3.3). The ordering (3.4.16) will turn out to be preferable in applications to be discussed later. The computation of the corresponding incomplete factorization factors is discussed in [141]. If alternating ILU is used, the factors L, D and U of the first factorization are already stored in the place of A, so that additional storage is required for the factors of the second factorization. N can be stored, or is easily computed, as one prefers.

Figure 3.4.1: Illustration of second backward ordering.


General  ILU 

Other ILU variants are obtained for other choices of $\mathcal{G}$. See [88] for some possibilities. In general it is advisable to choose $\mathcal{G}$ equal to or slightly larger than the graph of A. If $\mathcal{G}$ is smaller than the graph of A then nothing changes in the algorithms just presented, except that the elements of A outside $\mathcal{G}$ are subtracted from N.

The following algorithm computes an ILU factorization for general $\mathcal{G}$ by incomplete Gauss elimination. A is an n × n matrix. We choose diag(L) = diag(U).

Algorithm  1.  Incomplete  Gauss  elimination 

A^0 := A

for r := 1 step 1 until n do
begin a^r_{rr} := sqrt(a^{r-1}_{rr})
   for j > r ∧ (r, j) ∈ G do a^r_{rj} := a^{r-1}_{rj} / a^r_{rr} od
   for i > r ∧ (i, r) ∈ G do a^r_{ir} := a^{r-1}_{ir} / a^r_{rr} od
   for (i, j) ∈ G ∧ i > r ∧ j > r ∧ (i, r) ∈ G ∧ (r, j) ∈ G do
      a^r_{ij} := a^{r-1}_{ij} - a^r_{ir} a^r_{rj}
   od
end od
end of algorithm 1.

$A^n$ contains L and U. In [57] one finds an algorithm for the $LD^{-1}U$ version of ILU, for arbitrary $\mathcal{G}$. See [143] and [138] for applications of ILU with a fairly complicated $\mathcal{G}$ (Navier-Stokes equations in the vorticity-stream function formulation).
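Incomplete Gauss elimination with diag(L) = diag(U) can be sketched in Python as below. This is an illustration under assumptions made here: the matrix is stored densely as a list of rows, the graph is a set of zero-based index pairs, and all pivots are positive (as for an M-matrix). The function name is hypothetical.

```python
import math


def incomplete_gauss(a, graph):
    """Incomplete Gauss elimination on a copy of a, restricted to the index set graph.

    On return the lower triangle (with diagonal) holds L and the upper
    triangle (with diagonal) holds U, with diag(L) = diag(U).
    """
    n = len(a)
    a = [row[:] for row in a]                 # work on a copy
    for r in range(n):
        a[r][r] = math.sqrt(a[r][r])          # pivot shared by L and U
        for j in range(r + 1, n):
            if (r, j) in graph:
                a[r][j] /= a[r][r]            # row r of U
        for i in range(r + 1, n):
            if (i, r) in graph:
                a[i][r] /= a[r][r]            # column r of L
        for i in range(r + 1, n):
            if (i, r) not in graph:
                continue
            for j in range(r + 1, n):
                # update only positions inside the graph; other fill-in is dropped
                if (r, j) in graph and (i, j) in graph:
                    a[i][j] -= a[i][r] * a[r][j]
    return a


matrix = [[4.0, -1.0, 0.0],
          [-1.0, 4.0, -1.0],
          [0.0, -1.0, 4.0]]
tridiagonal_graph = {(i, j) for i in range(3) for j in range(3) if abs(i - j) <= 1}
factors = incomplete_gauss(matrix, tridiagonal_graph)
```

For a tridiagonal matrix no fill-in occurs, so with the tridiagonal graph the factorization is complete and LU reproduces the matrix exactly; this makes a convenient sanity check.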

Final  remarks 

Existence  of  ILU  factorizations  and  numerical  stability  of  the  associated  algorithms  has  been 
proved  in  [87]  if  A  is  an  M-matrix;  it  is  also  shown  that  the  associated  splitting  is  regular,  so 
that  ILU  converges  according  to  Theorem  4.2.3.  For  information  on  efficient  implementations 
of  ILU  on  vector  and  parallel  computers,  see  [69],  [68],  [116],  [117],  [118],  [119],  [103]  and  [14]. 

Exercise 4.4.1 Derive algorithms to compute symmetric ILU factorizations $A = LD^{-1}L^T - N$ and $A = LL^T - N$ for A symmetric. See [87].

Exercise 4.4.2 Let A = L + D + U, with D = diag(A), $l_{ij} = 0$, $j \ge i$ and $u_{ij} = 0$, $j \le i$. Show that (3.4.3) results in symmetric point Gauss-Seidel (cf. Exercise 3.3.1). This shows that symmetric point Gauss-Seidel is a special instance of incomplete point factorization.


3.5  Examples of basic iterative methods: incomplete block LU factorization

Complete  line  LU  factorization 

The basic idea of incomplete block LU-factorization (IBLU) (also called incomplete line LU-factorization (ILLU) in the literature) is presented by means of the following example. Let the stencil of the difference equations to be solved be given by Figure 3.3.3(c). The grid point ordering is given by (3.3.2). Then the matrix A of the system to be solved is as follows:


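$A = \begin{pmatrix} B_1 & U_1 & & & \\ L_2 & B_2 & U_2 & & \\ & \ddots & \ddots & \ddots & \\ & & L_{J-1} & B_{J-1} & U_{J-1} \\ & & & L_J & B_J \end{pmatrix}$  (3.5.1)

with $L_j$, $B_j$ and $U_j$ I × I tridiagonal matrices.
First, we show that there is a matrix D such that

$A = (L + D)D^{-1}(D + U)$  (3.5.2)

where

$L = \begin{pmatrix} 0 & & & \\ L_2 & 0 & & \\ & \ddots & \ddots & \\ & & L_J & 0 \end{pmatrix}, \quad D = \mathrm{diag}(D_1, \ldots, D_J), \quad U = \begin{pmatrix} 0 & U_1 & & \\ & \ddots & \ddots & \\ & & 0 & U_{J-1} \\ & & & 0 \end{pmatrix}$  (3.5.3)

We call (3.5.2) a line LU factorization of A, because the blocks in L, D and U correspond to (in our case horizontal) lines in the computational grid. From (3.5.2) it follows that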

$A = L + D + U + LD^{-1}U$  (3.5.4)

One finds that $LD^{-1}U$ is the following block-diagonal matrix

$LD^{-1}U = \mathrm{diag}(0, \; L_2 D_1^{-1} U_1, \; \ldots, \; L_J D_{J-1}^{-1} U_{J-1})$  (3.5.5)

From (3.5.4) and (3.5.5) the following recursion to compute D is obtained

$D_1 = B_1, \quad D_j = B_j - L_j D_{j-1}^{-1} U_{j-1}, \quad j = 2, 3, \ldots, J$  (3.5.6)

Provided $D_j^{-1}$ exists, this shows that one can find D such that (3.5.2) holds.

Nine-point  IBLU 

The matrices $D_j$ are full; therefore incomplete variants of (3.5.2) have been proposed. An incomplete variant is obtained by replacing $L_j D_{j-1}^{-1} U_{j-1}$ in (3.5.6) by its tridiagonal part (i.e. replacing all elements with indices i, m with $m \ne i, i \pm 1$ by zero):

$\tilde{D}_1 = B_1, \quad \tilde{D}_j = B_j - \mathrm{tridiag}(L_j \tilde{D}_{j-1}^{-1} U_{j-1})$  (3.5.7)

The IBLU factorization of A is defined as

$A = (L + \tilde{D})\tilde{D}^{-1}(\tilde{D} + U) - N$  (3.5.8)

There are three non-zero elements per row in L, $\tilde{D}$ and U; thus we call this nine-point IBLU. For an algorithm to compute $\tilde{D}$ and $\tilde{D}^{-1}$ see [141].

The  IBLU  iterative  method 

With IBLU, the basic iterative method (3.1.3) becomes

$r = b - Ay^m$  (3.5.9)

$(L + \tilde{D})\tilde{D}^{-1}(\tilde{D} + U)y^{m+1} = r$  (3.5.10)

$y^{m+1} := y^{m+1} + y^m$  (3.5.11)

Equation (3.5.10) is solved as follows

Solve $(L + \tilde{D})y^{m+1} = r$  (3.5.12)

$r := \tilde{D}y^{m+1}$  (3.5.13)

Solve $(\tilde{D} + U)y^{m+1} = r$  (3.5.14)

With the block partitioning used before, and with $y_j$ and $r_j$ denoting I-dimensional vectors corresponding to block j, Equation (3.5.12) is solved as follows:

$\tilde{D}_1 y_1^{m+1} = r_1, \quad \tilde{D}_j y_j^{m+1} = r_j - L_j y_{j-1}^{m+1}, \quad j = 2, 3, \ldots, J$  (3.5.15)

Equation (3.5.14) is solved in a similar fashion.

Other  IBLU  variants 

Other IBLU variants are obtained by taking other graphs for L, $\tilde{D}$ and U. When A corresponds to the five-point stencil of Figure 3.3.3, L and U are diagonal matrices, resulting in the five-point IBLU variants. When A corresponds to the seven-point stencils of Figure 3.3.3(a), (b), L and U are bidiagonal, resulting in seven-point IBLU. There are also other possibilities to approximate $L_j \tilde{D}_{j-1}^{-1} U_{j-1}$ by a sparse matrix. See [6], [33], [7], [99], [107] for other versions of IBLU; the first three publications also give existence proofs for $\tilde{D}_j$ if A is an M-matrix; this condition is slightly weakened in [99]. Vectorization and parallelization aspects are discussed in [7].

Exercise 3.5.1 Derive an algorithm to compute a symmetric IBLU factorization $A = (L + D)D^{-1}(D + L^T) - N$ for A symmetric. See [33].

3.6  Some  methods  for  non-M-matrices 

When non-self-adjoint partial differential equations are discretized it may happen that the resulting matrix A is not an M-matrix. This depends on the type of discretization and the values of the coefficients. Examples of other applications leading to non-M-matrix discretizations are the biharmonic equation and the Stokes and Navier-Stokes equations of fluid dynamics.

Defect  correction 

Defect  correction  can  be  used  when  one  has  a  second-order  accurate  discretization  with  a 
matrix  A  that  is  not  an  M-matrix,  and  a  first-order  discretization  with  a  matrix  B  which  is 
an  M-matrix,  for  example  because  B  is  obtained  with  upwind  discretization,  or  because  B 
contains  artificial  viscosity.  Then  one  can  obtain  second-order  results  as  follows. 

Algorithm 1. Defect correction

begin Solve $B\bar{y} = b$
   for i := 1 step 1 until n do
      Solve $By = b - A\bar{y} + B\bar{y}$
      $\bar{y} := y$
   od
end of algorithm 1.

It suffices in practice to take n = 1 or 2. For simple problems it can be shown that for n = 1 already $\bar{y}$ has second-order accuracy. B is an M-matrix; thus the methods discussed before can be used to solve for y.
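The defect correction loop can be sketched as follows. This is a small illustration, not a production solver: the systems with B are solved by a dense Gaussian elimination helper written for the example, whereas in practice one would use the iterative methods discussed before. The names `solve`, `matvec` and `defect_correction` and the 2 × 2 test matrices are assumptions made here.

```python
def solve(mat, rhs):
    """Dense Gaussian elimination with partial pivoting (helper for the example)."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for cc in range(col, n + 1):
                aug[r][cc] -= factor * aug[col][cc]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def matvec(mat, v):
    return [sum(m * x for m, x in zip(row, v)) for row in mat]


def defect_correction(A, B, b, n):
    """Solve B ybar = b, then apply n corrections B y = b - A ybar + B ybar."""
    ybar = solve(B, b)
    for _ in range(n):
        Ay = matvec(A, ybar)
        By = matvec(B, ybar)
        rhs = [bi - ai + ci for bi, ai, ci in zip(b, Ay, By)]
        ybar = solve(B, rhs)
    return ybar


A = [[2.0, 1.0], [1.0, 2.0]]      # not an M-matrix (positive off-diagonal entries)
B = [[2.0, 0.0], [0.0, 2.0]]      # a simple M-matrix standing in for the low-order operator
b = [1.0, -1.0]                   # chosen so that A y = b has the solution (1, -1)
y_one = defect_correction(A, B, b, 1)
y_many = defect_correction(A, B, b, 60)
```

If $\bar{y}$ already solves Ay = b, the right-hand side reduces to $B\bar{y}$, so the exact solution of the accurate discretization is a fixed point of the loop.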

Distributive  iteration 

Instead  of  solving  Ay  =  b  one  may  also  solve 

$AB\bar{y} = b, \quad y = B\bar{y}$  (3.6.1)

This  may  be  called  post-conditioning,  in  analogy  with  preconditioning,  where  one  solves 
BAy  =  Bb.  B  is  chosen  such  that  AB  is  an  M-matrix  or  a  small  perturbation  of  an 
M-matrix,  such  that  the  splitting 

AB  =  M-N  (3.6.2) 


leads to a convergent iteration method. From (3.6.2) follows the following splitting for the original matrix A

$A = MB^{-1} - NB^{-1}$  (3.6.3)

This leads to the following iteration method

$MB^{-1}y^{m+1} = NB^{-1}y^m + b$  (3.6.4)

or

$y^{m+1} = y^m + BM^{-1}(b - Ay^m)$  (3.6.5)

The iteration method is based on (3.6.3) rather than on (3.6.2), because if M is modified so that (3.6.2) does not hold, then, obviously, (3.6.5) still converges to the right solution, if it converges. Such modifications of M occur in applications of post-conditioned iteration to the Stokes and Navier-Stokes equations.

Iteration method (3.6.4) is called distributive iteration, because the correction $M^{-1}(b - Ay^m)$ is distributed over the elements of y by the matrix B. A general treatment of this approach is given in [144], [146], [148], [150], [149], where it is shown that a number of well known iterative methods for the Stokes and Navier-Stokes equations can be interpreted as distributive iteration methods.

Taking $B = A^T$ and choosing (3.6.2) to be the Gauss-Seidel or Jacobi splitting results in the Kaczmarz [78] or Cimmino [32] methods, respectively. These methods converge for every regular A, because Gauss-Seidel and Jacobi converge for symmetric positive definite matrices (a proof of this elementary result may be found in [70]). Convergence is, however, usually slow.
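The Kaczmarz method can be written without forming $AA^T$: applying Gauss-Seidel to $AA^Tz = b$ and distributing each scalar correction with $y = A^Tz$ amounts to projecting the iterate onto one equation at a time. The sketch below uses this row-projection form; the function name and the two small test matrices are choices made for this illustration.

```python
def kaczmarz_sweep(A, b, y):
    """One Kaczmarz sweep: Gauss-Seidel on A A^T z = b, carried out in terms of y = A^T z."""
    y = list(y)
    for row, b_i in zip(A, b):
        residual = b_i - sum(a_ij * y_j for a_ij, y_j in zip(row, y))
        norm2 = sum(a_ij * a_ij for a_ij in row)      # diagonal entry of A A^T
        # distribute the scalar correction over y using row i of A (column i of A^T)
        y = [y_j + residual * a_ij / norm2 for a_ij, y_j in zip(row, y)]
    return y


# A matrix with zero diagonal: point Jacobi and Gauss-Seidel applied to A itself break down.
A_perm = [[0.0, 1.0], [1.0, 0.0]]
y_perm = kaczmarz_sweep(A_perm, [2.0, 1.0], [0.0, 0.0])

A_gen = [[2.0, 1.0], [1.0, 3.0]]
b_gen = [4.0, 7.0]                 # chosen so that the solution is (1, 2)
y_gen = [0.0, 0.0]
for _ in range(200):
    y_gen = kaczmarz_sweep(A_gen, b_gen, y_gen)
```

The first example converges in one sweep because the rows are orthogonal; the second needs many sweeps, in line with the remark that convergence is usually slow.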

4  Smoothing  analysis 

4.1  Introduction 

The convergence behaviour of a multigrid algorithm depends strongly on the smoother. The efficiency of smoothing methods is problem-dependent. When a smoother is efficient for a large class of problems it is called robust. This concept will be made more precise shortly for a certain class of problems. Not every convergent method has the smoothing property, but for symmetric matrices it can be shown that by the introduction of a suitable amount of damping every convergent method acquires the smoothing property. This property says little
about  the  actual  efficiency.  A  convenient  tool  for  the  study  of  smoothing  efficiency  is  Fourier 
analysis,  which  is  also  easily  applied  to  the  non-symmetric  case.  Fourier  smoothing  analysis 
is  the  main  topic  of  this  chapter. 

Many  different  smoothing  methods  are  employed  by  users  of  multigrid  methods.  Of  course, 
in  order  to  explain  the  basic  principles  of  smoothing  analysis  it  suffices  to  discuss  only  a  few 
methods  by  way  of  illustration.  To  facilitate  the  making  of  a  good  choice  of  a  smoothing 
method  for  a  particular  application  it  is,  however,  useful  to  gather  smoothing  analysis  results 
which  are  scattered  through  the  literature  in  one  place,  and  to  complete  the  information 
where  results  for  important  cases  are  lacking. 

4.2  The  smoothing  property 

The  smoothing  method  is  assumed  to  be  a  basic  iterative  method  as  defined  by  (3.1.3).  We 
will assume that A is a K-matrix. Often, the smoother is obtained in the way described in
Theorem  3.2.5;  in  practice  one  rarely  encounters  anything  else. 

The  smoothing  property  is  defined  as  follows  ([57]): 

Definition 4.2.1 Smoothing property. S has the smoothing property if there exist a constant $C_S$ and a function $\eta(\nu)$ independent of the mesh-size h such that

$\|AS^\nu\| \le C_S h^{-2m} \eta(\nu), \quad \eta(\nu) \to 0 \text{ for } \nu \to \infty$  (4.2.1)

where 2m is the order of the partial differential equation to be solved.

Here S is the iteration matrix defined by (3.1.5). The smoothing property implies convergence [141] but, as already remarked, the converse is not true. In [145] it is shown that a convergent method can be turned into a smoother by damping; for a fuller discussion see [141].

Discussion 

In [57] the smoothing property is shown for a number of iterative methods. The smoothing property of incomplete factorization methods is studied in [145], [147]. Non-symmetric problems can be handled by perturbation arguments, as indicated by [57]. When the non-symmetric part is dominant, however, as in singular perturbation problems, this does not lead to useful results. Fourier smoothing analysis (which, however, also has its limitations) can handle the non-symmetric case easily, and also provides an easy way to optimize values of damping parameters and to predict smoothing efficiency. The introduction of damping does not necessarily give a robust smoother. The differential equation may contain a parameter, such that when it tends to a certain limit, smoothing efficiency deteriorates. Examples and further discussion of robustness will follow. We will concentrate on Fourier smoothing analysis.

4.3  Elements  of  Fourier  analysis  in  grid-function  space 

As  preparation  we  start  with  the  one-dimensional  case. 

The  one-dimensional  case 

Theorem 4.3.1. Discrete Fourier transform. Let I = {0, 1, 2, ..., n-1}. Every $u : I \to \mathbb{R}$ can be written as

$u_j = \sum_{k=-m}^{m+p} c_k \psi_j(\theta_k), \quad \psi_j(\theta_k) = \exp(ij\theta_k), \quad \theta_k = 2\pi k/n, \quad j \in I$  (4.3.1)

where p = 0, m = (n - 1)/2 for n odd and p = 1, m = n/2 - 1 for n even, and

$c_k = n^{-1} \sum_{j=0}^{n-1} u_j \psi_j(-\theta_k)$  (4.3.2)

The functions $\psi(\theta)$ are called Fourier modes or Fourier components. For a proof of this elementary theorem see [141].
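The discrete Fourier transform pair of Theorem 4.3.1 is easy to check numerically. The sketch below assumes the expansion runs over k = -m, ..., m + p with $\theta_k = 2\pi k/n$ and that the coefficients are $c_k = n^{-1}\sum_j u_j \exp(-ij\theta_k)$; the function names are hypothetical and the implementation is a direct O(n²) evaluation, not a fast transform.

```python
import cmath
import math


def wavenumbers(n):
    """Indices k = -m, ..., m + p of Theorem 4.3.1 (p = 0 for n odd, p = 1 for n even)."""
    if n % 2 == 1:
        m, p = (n - 1) // 2, 0
    else:
        m, p = n // 2 - 1, 1
    return list(range(-m, m + p + 1))


def psi(j, theta):
    """Fourier mode psi_j(theta) = exp(i j theta)."""
    return cmath.exp(1j * j * theta)


def fourier_coefficients(u):
    """Coefficients c_k according to (4.3.2), returned as a dict k -> c_k."""
    n = len(u)
    return {k: sum(u[j] * psi(j, -2.0 * math.pi * k / n) for j in range(n)) / n
            for k in wavenumbers(n)}


def reconstruct(c, n):
    """Evaluate the series (4.3.1) in the points j = 0, ..., n - 1."""
    return [sum(c_k * psi(j, 2.0 * math.pi * k / n) for k, c_k in c.items())
            for j in range(n)]


u = [1.0, -2.0, 0.5, 3.0, 4.0, 0.0]
c = fourier_coefficients(u)
u_back = reconstruct(c, len(u))    # equals u up to rounding, with zero imaginary part
```

In both the odd and the even case there are exactly n wavenumbers, so the n modes form a basis for the grid functions on I.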

The  multi-dimensional  case 
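Define

$\psi_j(\theta) = \exp(ij\theta)$  (4.3.3)

with $j \in I$, $\theta \in \Theta$, with

$I = \{j : j = (j_1, j_2, \ldots, j_d), \; j_\alpha = 0, 1, 2, \ldots, n_\alpha - 1, \; \alpha = 1, 2, \ldots, d\}$  (4.3.4)

$\Theta = \{\theta : \theta = (\theta_1, \theta_2, \ldots, \theta_d), \; \theta_\alpha = 2\pi k_\alpha/n_\alpha, \; k_\alpha = -m_\alpha, -m_\alpha + 1, \ldots, m_\alpha + p_\alpha, \; \alpha = 1, 2, \ldots, d\}$  (4.3.5)

where $p_\alpha = 0$, $m_\alpha = (n_\alpha - 1)/2$ for $n_\alpha$ odd and $p_\alpha = 1$, $m_\alpha = n_\alpha/2 - 1$ for $n_\alpha$ even. Furthermore,

$j\theta = \sum_{\alpha=1}^{d} j_\alpha \theta_\alpha$  (4.3.6)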

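Theorem 4.3.2. Discrete Fourier transform in d dimensions. Every $u : I \to \mathbb{R}$ can be written as

$u_j = \sum_{\theta \in \Theta} c_\theta \psi_j(\theta)$  (4.3.7)

with

$c_\theta = N^{-1} \sum_{j \in I} u_j \psi_j(-\theta), \quad N = \prod_{\alpha=1}^{d} n_\alpha$  (4.3.8)

For a proof see [141].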

The  Fourier  series  (4.3.7)  is  appropriate  for  d-dimensional  vertex-  or  cell-centered  grids  with 
periodic  boundary  conditions.  For  the  use  of  Fourier  sine  or  cosine  series  to  accommodate 
Dirichlet  or  Neumann  conditions,  see  [141]. 


4.4  The  Fourier  smoothing  factor 

Definition  of  the  local  mode  smoothing  factor 
Let  the  problem  to  be  solved  on  grid  G  be  denoted  by 

$Au = f$  (4.4.1)

and let the smoothing method to be used be given by (3.1.6):

$u := Su + M^{-1}f, \quad S = M^{-1}N, \quad M - N = A$  (4.4.2)

According to (3.2.1) the relation between the error before and after ν smoothing iterations is

$e^1 = S^\nu e^0$  (4.4.3)

We  now  make  the  following  assumption. 

Assumption (i). The operator S has a complete set of eigenfunctions or local modes denoted by $\psi(\theta)$, $\theta \in \Theta$, with Θ some discrete index set.

Hence

$S^\nu \psi(\theta) = \lambda^\nu(\theta)\psi(\theta)$  (4.4.4)

with $\lambda(\theta)$ the eigenvalue belonging to $\psi(\theta)$. We can write

$e^\alpha = \sum_{\theta \in \Theta} c_\theta^\alpha \psi(\theta), \quad \alpha = 0, 1$

and obtain

$c_\theta^1 = \lambda^\nu(\theta) c_\theta^0$  (4.4.5)

The eigenvalue $\lambda(\theta)$ is also called the amplification factor of the local mode $\psi(\theta)$.

Next, assume that among the eigenfunctions $\psi(\theta)$ we somehow distinguish between smooth eigenfunctions ($\theta \in \Theta_s$) and rough eigenfunctions ($\theta \in \Theta_r$):

$\Theta = \Theta_s \cup \Theta_r, \quad \Theta_s \cap \Theta_r = \emptyset$  (4.4.6)

We  now  make  the  following  definition. 

Definition 4.4.1. Local mode smoothing factor. The local mode smoothing factor ρ of the smoothing method (4.4.2) is defined by

$\rho = \sup\{|\lambda(\theta)| : \theta \in \Theta_r\}$  (4.4.7)

Hence, after ν smoothings the amplitudes of the rough components of the error are multiplied by a factor $\rho^\nu$ or smaller.

Fourier  smoothing  analysis 

In order to obtain from this analysis a useful tool for examining the quality of smoothing methods we must be able to easily determine ρ, and to choose $\Theta_s$ such that an error $e = \psi(\theta)$, $\theta \in \Theta_s$ is well reduced by coarse grid correction. This can be done if Assumption (ii) is satisfied.

Assumption (ii). The eigenfunctions $\psi(\theta)$ of S are harmonic functions.

This assumption means that the series preceding (4.4.5) is a Fourier series. When this is so ρ is also called the Fourier smoothing factor. In the next section we will give conditions such that Assumption (ii) holds, and show how ρ is easily determined; but first we discuss the choice of $\Theta_r$.
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Aliasing 

Consider  the  vertex-centered  grid  G  given  by  (4.4.8)  with  even,  and  the  corresponding 
coarse  grid  G  defined  by  doubling  the  mesh-size: 

G  —  \x  G  fft  •  X  —  jhy  j  —  (jl  9  J2?  •••9  Jrf)?  ^  —  (^1  9  ^29  “••9 

joi  =  0, 1, 2, ...,  hoi  =  ifuci^  0  =  1,2, ...,  d}  (4.4.8) 

G  =  {x  e  :x  =  jh,  j  =  -Jd),  h  =  (hi,h2,...,hd), 

3a  ~  0>  1>  2, tIq.,  hoi  =  Xftioi^  o  =  1,2,  (4.4.9) 

with  Tia  =  na/2.  Let  d  =  1,  and  assume  that  the  eigenfunctions  of  S  on  the  fine  grid  G  are 
the  Fourier  modes  of  Theorem  4.3.1:  with 

0  6  0  =  {0  :  0  =  2Trk/m,  k  =  -ni/2  +  1,  -ni/2  +  2, ...,  Tii/2}  (4.4.10) 

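so that an arbitrary grid function $v$ on $G$ can be represented by the following Fourier series

$$v_j = \sum_{\theta \in \Theta} c_\theta\, \psi_j(\theta) \tag{4.4.11}$$

An arbitrary grid function $\bar v$ on $\bar G$ can be represented by

$$\bar v_j = \sum_{\bar\theta \in \bar\Theta} \bar c_{\bar\theta}\, \bar\psi_j(\bar\theta) \tag{4.4.12}$$

with $\bar\psi(\bar\theta) : \bar G \to \mathbb{C}$, $\bar\psi_j(\bar\theta) = \exp(ij\bar\theta)$, and

$$\bar\Theta = \{\bar\theta : \bar\theta = 2\pi k/\bar n_1,\ k = -\bar n_1/2 + 1, -\bar n_1/2 + 2, \ldots, \bar n_1/2\} \tag{4.4.13}$$

assuming for simplicity that $\bar n_1$ is even. The coarse grid point $\bar x_j = j\bar h$ coincides with the fine grid point $x_{2j} = 2jh$. In these points the coarse grid Fourier mode $\bar\psi(\bar\theta)$ takes on the value

$$\bar\psi_j(\bar\theta) = \exp(ij\bar\theta) = \exp(i2j\theta), \qquad \theta = \bar\theta/2 \tag{4.4.14}$$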

For $-n_1/4 + 1 \le k \le n_1/4$ the fine grid Fourier mode $\psi(\theta_k)$ takes on in the coarse grid points $\bar x_j$ the values $\psi_{2j}(\theta_k) = \exp(2\pi ijk/\bar n_1) = \bar\psi_j(2\pi k/\bar n_1)$, and we see that it coincides with the coarse grid mode $\bar\psi(\bar\theta_k)$ in the coarse grid points. But this is also the case for another fine grid mode. Define $k'$ as follows

$$k' = \begin{cases} -n_1/2 + k, & 0 < k \le \bar n_1/2 \\ \phantom{-}n_1/2 + k, & -\bar n_1/2 < k \le 0 \end{cases} \tag{4.4.15}$$

Then the fine grid Fourier mode $\psi(\theta_{k'})$ also coincides with $\bar\psi(\bar\theta_k)$ in the coarse grid points. Indeed, $\theta_{k'} = \theta_k \mp \pi$, so that $\psi_{2j}(\theta_{k'}) = \psi_{2j}(\theta_k)\exp(\mp 2\pi ij) = \psi_{2j}(\theta_k)$.

On the coarse grid, $\psi(\theta_{k'})$ cannot be distinguished from $\psi(\theta_k)$. This is called aliasing: the rapidly varying function $\psi(\theta_{k'})$ takes on the appearance of the much smoother function $\psi(\theta_k)$ on the coarse grid.
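
The coincidence of the two fine grid modes on the coarse grid is easy to check numerically. The following Python sketch is illustrative only: the names `fine_mode` and `alias_partner` and the grid size `n1 = 16` are assumptions made here, not part of these notes. It evaluates the modes with indices $k$ and $k'$ of (4.4.15) at the coarse grid points $x_{2j}$ and at one fine grid point that is not a coarse grid point.

```python
import cmath
import math

def fine_mode(k, j, n1):
    """Fine grid Fourier mode psi_j(theta_k) = exp(i*j*theta_k), theta_k = 2*pi*k/n1."""
    return cmath.exp(1j * j * 2.0 * math.pi * k / n1)

def alias_partner(k, n1):
    """Index k' of (4.4.15) for a smooth index -n1/4 < k <= n1/4 (n1 divisible by 4)."""
    return k - n1 // 2 if k > 0 else k + n1 // 2

n1 = 16                       # number of fine grid intervals (assumed value)
k = 3                         # a smooth mode
kp = alias_partner(k, n1)     # its rough partner

# Largest difference of the two modes over the coarse grid points x_{2j}
diff_coarse = max(abs(fine_mode(k, 2 * j, n1) - fine_mode(kp, 2 * j, n1))
                  for j in range(n1 // 2))
# Difference at the fine grid point x_1, which is not a coarse grid point
diff_fine = abs(fine_mode(k, 1, n1) - fine_mode(kp, 1, n1))
```

On the coarse grid points the difference vanishes up to rounding, while at the intermediate fine grid point the two modes have opposite sign.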

Smooth  and  rough  Fourier  modes 

Because on the coarse grid $\bar G$ the rapidly varying function $\psi(\theta_{k'})$ cannot be approximated, and cannot be distinguished from $\psi(\theta_k)$, there is no hope that the part of the error which consists of Fourier modes $\psi(\theta_{k'})$, $k'$ given by (4.4.15), can be approximated on the coarse grid $\bar G$. This part of the error is called rough or non-smooth. The rough Fourier modes are defined to be $\psi(\theta_{k'})$, with $k'$ given by (4.4.15), that is

$$k' \in \{-n_1/2 + 1, -n_1/2 + 2, \ldots, -n_1/4\} \cup \{n_1/4, n_1/4 + 1, \ldots, n_1/2\} \tag{4.4.16}$$

This gives us the set of rough wavenumbers $\Theta_r = \{\theta : \theta = 2\pi k'/n_1,\ k' \text{ according to (4.4.16)}\}$, or

$$\Theta_r = \{\theta : \theta = 2\pi k/n_1,\ k = -n_1/2 + 1, -n_1/2 + 2, \ldots, n_1/2 \ \text{and}\ \theta \in [-\pi, -\pi/2] \cup [\pi/2, \pi]\} \tag{4.4.17}$$

The set of smooth wavenumbers $\Theta_s$ is defined as $\Theta_s = \Theta \setminus \Theta_r$, $\Theta$ given by (4.4.10) with $d = 1$, or

$$\Theta_s = \{\theta : \theta = 2\pi k/n_1,\ k = -n_1/2 + 1, -n_1/2 + 2, \ldots, n_1/2 \ \text{and}\ \theta \in (-\pi/2, \pi/2)\} \tag{4.4.18}$$
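
As a small illustration of (4.4.17) and (4.4.18), the one-dimensional wavenumber sets can be enumerated directly. This is a sketch under the assumption that $n_1$ is divisible by 4; the function name `wavenumber_sets_1d` is hypothetical. The classification is done on the integer index $k$ so that the boundary values $\pm\pi/2$ are assigned to the rough set without rounding problems.

```python
import math

def wavenumber_sets_1d(n1):
    """Return (smooth, rough) lists of theta = 2*pi*k/n1, k = -n1/2+1, ..., n1/2."""
    smooth, rough = [], []
    for k in range(-n1 // 2 + 1, n1 // 2 + 1):
        theta = 2.0 * math.pi * k / n1
        # theta in [-pi, -pi/2] U [pi/2, pi] is equivalent to 4*|k| >= n1
        if 4 * abs(k) >= n1:
            rough.append(theta)
        else:
            smooth.append(theta)
    return smooth, rough

smooth, rough = wavenumber_sets_1d(16)
```

For `n1 = 16` this gives 7 smooth and 9 rough wavenumbers, 16 in total.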

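The smooth and rough parts $v_s$ and $v_r$ of a grid function $v : G \to \mathbb{R}$ can now be defined precisely by

$$v_s = \sum_{\theta \in \Theta_s} c_\theta\, \psi(\theta), \qquad v_r = \sum_{\theta \in \Theta_r} c_\theta\, \psi(\theta), \qquad c_\theta = n_1^{-1} \sum_{j=0}^{n_1 - 1} v_j\, \psi_j(-\theta) \tag{4.4.19}$$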

In $d$ dimensions the generalization of (4.4.17) and (4.4.18) (periodic boundary conditions) is

$$\Theta = \{\theta : \theta = (\theta_1, \theta_2, \ldots, \theta_d),\ \theta_\alpha = 2\pi k_\alpha/n_\alpha,\ k_\alpha = -n_\alpha/2 + 1, \ldots, n_\alpha/2\}$$

$$\Theta_s = \Theta \cap \prod_{\alpha=1}^{d} (-\pi/2, \pi/2), \qquad \Theta_r = \Theta \setminus \Theta_s \tag{4.4.20}$$

Figure 4.4.1: Smooth ($\Theta_s$) and rough ($\Theta_r$, hatched) wavenumber sets in two dimensions, standard coarsening.

Figure 4.4.1 gives a graphical illustration of the smooth and rough wavenumber sets (4.4.20) for $d = 2$. $\Theta_r$ and $\Theta_s$ are discrete sets in the two concentric squares. As the mesh-size is decreased ($n_\alpha$ is increased) these discrete sets become more densely distributed.

Semi-coarsening 

The above definition of $\Theta_r$ and $\Theta_s$ in two dimensions is appropriate for standard coarsening, i.e. $\bar G$ is obtained from $G$ by doubling the mesh-size in all directions $\alpha = 1, 2, \ldots, d$.

With semi-coarsening there is at least one direction in which $h_\alpha$ in $\bar G$ is the same as in $G$. Of course, in this direction no aliasing occurs, and all Fourier modes on $G$ in this direction can be resolved on $\bar G$, so they are not included in $\Theta_r$. To give an example in two dimensions, assume $\bar h_1 = h_1$ (semi-coarsening in the $x_2$-direction). Then (4.4.20) is replaced by

$$\Theta_s = \Theta \cap \{[-\pi, \pi] \times (-\pi/2, \pi/2)\}, \qquad \Theta_r = \Theta \setminus \Theta_s \tag{4.4.21}$$

Figure 4.4.2 gives a graphical illustration.

Mesh-size  independent  definition  of  smoothing  factor 

We have a smoothing method on the grid $G$ if uniformly in $n_\alpha$ there exists a $\rho^*$ such that

$$\rho \le \rho^* < 1, \qquad \forall n_\alpha,\ \alpha = 1, 2, \ldots, d \tag{4.4.22}$$

Figure 4.4.2: Smooth ($\Theta_s$) and rough ($\Theta_r$, hatched) wavenumber sets in two dimensions, semi-coarsening in $x_2$ direction.

However, $\rho$ as defined by (4.4.7) depends on $n_\alpha$, because $\Theta_r$ depends on $n_\alpha$. In order to obtain a mesh-independent condition which implies (4.4.22) we define a set $\bar\Theta_r \supset \Theta_r$ with $\bar\Theta_r$ independent of $n_\alpha$ and define

$$\bar\rho = \sup\{|\lambda(\theta)| : \theta \in \bar\Theta_r\} \tag{4.4.23}$$

so that

$$\rho \le \bar\rho \tag{4.4.24}$$

and we have a smoothing method if $\bar\rho < 1$. For example, if $\Theta_r$ is defined by (4.4.20), then we may define $\bar\Theta_r$ as follows:

$$\bar\Theta_r = \prod_{\alpha=1}^{d} [-\pi, \pi] \setminus \prod_{\alpha=1}^{d} (-\pi/2, \pi/2) \tag{4.4.25}$$

This type of Fourier analysis, and definition (4.4.23) of the smoothing factor, have been introduced by Brandt (1977). It may happen that $\lambda(\theta)$ still depends on the mesh-size, in which case $\bar\rho$ is not really independent of the mesh-size, of course.

Modification  of  smoothing  factor  for  Dirichlet  boundary  conditions 
If $\lambda(\theta)$ is smooth, then $\bar\rho - \rho = O(h_\alpha^m)$ for some $m \ge 1$. It may, however, happen that there is a parameter in the differential equation, say $\varepsilon$, such that for example $\bar\rho - \rho = O(h_\alpha^2/\varepsilon)$. Then, for $\varepsilon \ll 1$ (singular perturbation problems), for practical values of $h_\alpha$ there may be a large difference between $\rho$ and $\bar\rho$. Even if $\bar\rho = 1$, one may still have a good smoother. Large discrepancies between predictions based on $\bar\rho$ and practical observations may occur for singular perturbation problems when the boundary conditions are not periodic. It turns out that discrepancies due to the fact that the boundary conditions are not of the assumed type arise mainly from the presence or absence of wavenumber components $\theta_\alpha = 0$ (present with periodic boundary conditions, absent with Dirichlet boundary conditions). It has been observed [29], [83], [147] that when using the exponential Fourier series (4.3.7) for smoothing analysis of a practical case with Dirichlet boundary conditions, often better agreement with practical results is obtained by leaving wavenumbers with $\theta_\alpha = 0$ out, changing the definition of $\Theta_r$ in (4.4.7) from (4.4.20) to

$$\Theta^D = \{\theta : \theta = (\theta_1, \theta_2, \ldots, \theta_d),\ \theta_\alpha = 2\pi k_\alpha/n_\alpha,\ k_\alpha \ne 0,\ k_\alpha = -n_\alpha/2 + 1, \ldots, n_\alpha/2\}$$

$$\Theta_s^D = \Theta^D \cap \prod_{\alpha=1}^{d} (-\pi/2, \pi/2), \qquad \Theta_r^D = \Theta^D \setminus \Theta_s^D \tag{4.4.26}$$

where the superscript $D$ serves to indicate the case of Dirichlet boundary conditions. The smoothing factor is now defined as

$$\rho_D = \sup\{|\lambda(\theta)| : \theta \in \Theta_r^D\} \tag{4.4.27}$$

Figure 4.4.3 gives an illustration of $\Theta_r^D$, which is a discrete set within the hatched region, for $d = 2$. Further support for the usefulness of definitions (4.4.26) and (4.4.27) will be given in the next section.

Notice that we have the following inequality

$$\rho_D \le \rho \le \bar\rho \tag{4.4.28}$$

If we have a Neumann boundary condition at both $x_\alpha = 0$ and $x_\alpha = 1$ then $\theta_\alpha = 0$ cannot be excluded, but if one has for example Dirichlet at $x_\alpha = 0$ and Neumann at $x_\alpha = 1$ then the error cannot contain a constant mode in the $x_\alpha$ direction, and $\theta_\alpha = 0$ can again be excluded.
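
The first inequality in (4.4.28) follows from the fact that $\Theta_r^D$ is a subset of $\Theta_r$. The Python sketch below makes this concrete for $d = 2$ and standard coarsening by enumerating the index pairs $(k_1, k_2)$; the function name `rough_indices_2d` and the choice $n_1 = n_2 = n = 8$ are assumptions made for illustration.

```python
def rough_indices_2d(n, dirichlet=False):
    """Index pairs (k1, k2) of the rough wavenumbers theta_a = 2*pi*k_a/n for
    standard coarsening in two dimensions: (4.4.20), or (4.4.26) if dirichlet."""
    ks = [k for k in range(-n // 2 + 1, n // 2 + 1)
          if not (dirichlet and k == 0)]
    # A wavenumber is smooth if |theta_a| < pi/2, i.e. 4*|k_a| < n, in both directions
    return {(k1, k2) for k1 in ks for k2 in ks
            if 4 * abs(k1) >= n or 4 * abs(k2) >= n}

periodic = rough_indices_2d(8)
dirichlet = rough_indices_2d(8, dirichlet=True)
```

Since the supremum in (4.4.27) is taken over the smaller set, it cannot exceed the supremum over $\Theta_r$.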


Exercise 4.4.1 Suppose $\bar h_1 = p h_1$ ($h_1$: mesh-size of $G$, $\bar h_1$: mesh-size of $\bar G$, one-dimensional case, $p$ some integer), and assume periodic boundary conditions. Show that we have aliasing for

$$\theta_k = 2\pi k/n_1, \qquad k \in \mathbb{Z} \cap \{(-n_1/2, -n_1/2p] \cup [n_1/2p, n_1/2]\}$$

and define sets $\Theta_r$, $\Theta_s$.

4.5  Fourier  smoothing  analysis 

Explicit  expression  for  the  amplification  factor 

In order to determine the smoothing factor $\rho$, $\bar\rho$ or $\rho_D$ according to definitions (4.4.7), (4.4.23) and (4.4.27) we have to solve the eigenvalue problem $S\psi(\theta) = \lambda(\theta)\psi(\theta)$ with $S$ given by (4.4.2). Hence, we have to solve $N\psi(\theta) = \lambda(\theta)M\psi(\theta)$. In stencil notation (to be more fully discussed later) this becomes, in $d$ dimensions,

$$\sum_{j \in \mathbb{Z}^d} N(m, j)\, \psi_{m+j}(\theta) = \lambda(\theta) \sum_{j \in \mathbb{Z}^d} M(m, j)\, \psi_{m+j}(\theta) \tag{4.5.1}$$

with $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$.

Figure 4.4.3: Rough wavenumber set ($\Theta_r^D$, hatched) in two dimensions, with exclusion of $\theta_\alpha = 0$ modes; standard coarsening.

We now assume the following.

Assumption (i). $M(m, j)$ and $N(m, j)$ do not depend on $m$.

This assumption is satisfied if the coefficients in the partial differential equation to be solved are constant, the mesh-size of $G$ is uniform and the boundary conditions are periodic. We write $M(j)$, $N(j)$ instead of $M(m, j)$, $N(m, j)$. As a consequence of Assumption (i), Assumption (ii) of Section 4.4 is satisfied: the eigenfunctions of $S$ are given by (4.3.3), since

$$\sum_{j \in \mathbb{Z}^d} N(j) \exp\{i(j + m)\theta\} = \exp(im\theta) \sum_{j \in \mathbb{Z}^d} N(j) \exp(ij\theta)$$

so that $\psi_m(\theta) = \exp(im\theta)$ satisfies (4.5.1) with

$$\lambda(\theta) = \sum_{j \in \mathbb{Z}^d} N(j) \exp(ij\theta) \Big/ \sum_{j \in \mathbb{Z}^d} M(j) \exp(ij\theta) \tag{4.5.2}$$

Periodicity requires that $\exp(im_\alpha\theta_\alpha) = \exp[i(m_\alpha + n_\alpha)\theta_\alpha]$, or $\exp(in_\alpha\theta_\alpha) = 1$. Hence $\theta \in \Theta$, as defined by (4.3.5), assuming $n_\alpha$ to be even. Hence, the eigenfunctions are the Fourier modes of Theorem 4.3.2.
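
Formula (4.5.2) translates almost literally into code. The sketch below is a minimal illustration for $d = 2$, assuming that the constant-coefficient stencils are stored as Python dictionaries mapping an offset $j = (j_1, j_2)$ to its coefficient; the function name `amplification` and the example splitting (undamped point Jacobi for the five-point Laplacian) are assumptions chosen here for illustration.

```python
import cmath
import math

def amplification(M, N, theta):
    """lambda(theta) of (4.5.2) for two-dimensional stencils M and N,
    given as dicts {(j1, j2): coefficient}."""
    def symbol(stencil):
        return sum(coef * cmath.exp(1j * (j[0] * theta[0] + j[1] * theta[1]))
                   for j, coef in stencil.items())
    return symbol(N) / symbol(M)

# Assumed example: A = M - N for the five-point Laplacian, M = diagonal of A
M = {(0, 0): 4.0}
N = {(1, 0): 1.0, (-1, 0): 1.0, (0, 1): 1.0, (0, -1): 1.0}

lam_pi_pi = amplification(M, N, (math.pi, math.pi))
lam_half = amplification(M, N, (math.pi / 2, 0.0))
```

For this example $\lambda(\theta) = (\cos\theta_1 + \cos\theta_2)/2$, so that $\lambda(\pi, \pi) = -1$ and $\lambda(\pi/2, 0) = 1/2$.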

Variable  coefficients,  robustness  of  smoother 

In general the coefficients of the partial differential equation to be solved will be variable, of course. Hence Assumption (i) will not be satisfied. The assumption of uniform mesh-size is less demanding, because often the computational grid $G$ is a boundary fitted grid, obtained by a mapping from the physical space, and is constructed such that $G$ is rectangular and has uniform mesh size. This facilitates the implementation of the boundary conditions and of a multigrid code. For the purpose of Fourier smoothing analysis the coefficients $M(m, j)$ and $N(m, j)$ are locally 'frozen'. We may expect to have a good smoother if $\rho < 1$ for all values $M(j)$, $N(j)$ that occur. This is supported by theoretical arguments advanced in [57], Section 8.2.2.

A  smoother  is  called  robust  if  it  works  for  a  large  class  of  problems.  Robustness  is  a 
quantitative  property,  which  can  be  defined  more  precisely  once  a  set  of  suitable  test  problems 
has  been  defined. 

Test  problems 

In order to investigate and compare efficiency and robustness of smoothing methods the following two special cases in two dimensions are useful:

$$-(\varepsilon c^2 + s^2)u_{,11} - 2(\varepsilon - 1)cs\,u_{,12} - (\varepsilon s^2 + c^2)u_{,22} = 0 \tag{4.5.3}$$

$$-\varepsilon(u_{,11} + u_{,22}) + cu_{,1} + su_{,2} = 0 \tag{4.5.4}$$

with $c = \cos\beta$, $s = \sin\beta$. There are two constant parameters to be varied: $\varepsilon > 0$ and $\beta$. Equation (4.5.3) is called the rotated anisotropic diffusion equation, because it is obtained by a rotation of the coordinate axes over an angle $\beta$ from the anisotropic diffusion equation:

$$-\varepsilon u_{,11} - u_{,22} = 0 \tag{4.5.5}$$

Equation (4.5.3) models not only anisotropic diffusion, but also variation of mesh aspect ratio, because with $\beta = 0$, $\varepsilon = 1$ and mesh aspect ratio $h_1/h_2 = \delta^{-1/2}$ discretization results in the same stencil as with $\varepsilon = \delta$, $h_1/h_2 = 1$, apart from a scale factor. With $\beta \ne k\pi/2$, $k = 0, 1, 2, 3$, (4.5.3) also brings in a mixed derivative, which may arise in practice because of the use of non-orthogonal boundary-fitted coordinates. Equation (4.5.4) is the convection-diffusion equation. It is not self-adjoint. For $\varepsilon \ll 1$ it is a singular perturbation problem, and is almost hyperbolic. Hyperbolic, almost hyperbolic and convection-dominated problems are common in fluid dynamics.
Equations (4.5.3) and (4.5.4) are not only useful for testing smoothing methods, but also for testing complete multigrid algorithms. General (as opposed to Fourier analysis) multigrid convergence theory is not uniform in the coefficients of the differential equation, and the theoretical rate of convergence is not bounded away from 1 as $\varepsilon \downarrow 0$ or $\varepsilon \to \infty$. In the absence of theoretical justification, one has to resort to numerical experiments to validate a method, and equations (4.5.3) and (4.5.4) constitute a set of discriminating test problems.

Finite difference discretization results in the following stencil for (4.5.3), assuming $h_1 = h_2 = h$ and multiplying by $h^2$:

$$[A] = (\varepsilon c^2 + s^2)\begin{bmatrix} -1 & 2 & -1 \end{bmatrix} + (\varepsilon - 1)cs \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix} + (\varepsilon s^2 + c^2)\begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix} \tag{4.5.6}$$

The matrix corresponding to this stencil is not a K-matrix (see Definition 3.2.6) if $(\varepsilon - 1)cs > 0$. If that is the case one can replace the stencil for the mixed derivative by

$$\begin{bmatrix} 0 & 1 & -1 \\ 1 & -2 & 1 \\ -1 & 1 & 0 \end{bmatrix} \tag{4.5.7}$$

We will not, however, use (4.5.7) in what follows.

A more symmetric stencil for $[A]$ is obtained if the mixed derivative is approximated by the average of the stencils employed in (4.5.6) and (4.5.7), namely

$$\frac{1}{2}\begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ -1 & 0 & 1 \end{bmatrix} \tag{4.5.8}$$

Note that for $[A]$ in (4.5.6) to correspond to a K-matrix it is also necessary that

$$\varepsilon c^2 + s^2 + (\varepsilon - 1)cs > 0 \quad \text{and} \quad \varepsilon s^2 + c^2 + (\varepsilon - 1)cs > 0 \tag{4.5.9}$$

This condition will be violated if $\varepsilon$ differs enough from 1 for certain values of $c = \cos\beta$, $s = \sin\beta$. With (4.5.8) there are always (if $(\varepsilon - 1)cs \ne 0$) positive off-diagonal elements, so that we never have a K-matrix. On the other hand, the 'wrong' elements are a factor 1/2 smaller than with the other two options. Smoothing analysis will show which of these variants lend themselves most for multigrid solution methods.
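
The sign conditions discussed above can be checked mechanically. The following Python sketch assembles the stencil of the rotated anisotropic diffusion equation with the mixed derivative discretized as in (4.5.6) or as in (4.5.8), and tests whether the centre is positive and all other entries are non-positive. It is an illustration under assumptions: stencils are stored as dictionaries keyed by offsets $(j_1, j_2)$ with $j_2 = +1$ the top row, the names `rotated_stencil` and `has_k_matrix_signs` are hypothetical, and only the sign pattern is tested, not the full K-matrix definition.

```python
import math

def rotated_stencil(eps, beta, mixed="4.5.6"):
    """Stencil [A] for (4.5.3), h1 = h2 = h, multiplied by h^2."""
    c, s = math.cos(beta), math.sin(beta)
    a, b, m = eps * c * c + s * s, eps * s * s + c * c, (eps - 1.0) * c * s
    A = {(j1, j2): 0.0 for j1 in (-1, 0, 1) for j2 in (-1, 0, 1)}
    for j, w in {(-1, 0): -a, (1, 0): -a, (0, -1): -b, (0, 1): -b,
                 (0, 0): 2 * a + 2 * b}.items():
        A[j] += w
    if mixed == "4.5.6":
        mix = {(-1, 1): 1, (0, 1): -1, (-1, 0): -1, (0, 0): 2,
               (1, 0): -1, (0, -1): -1, (1, -1): 1}
    else:  # averaged stencil (4.5.8)
        mix = {(-1, 1): 0.5, (1, 1): -0.5, (-1, -1): -0.5, (1, -1): 0.5}
    for j, w in mix.items():
        A[j] += m * w
    return A

def has_k_matrix_signs(A, tol=1e-14):
    """True if the centre is positive and all other entries are non-positive."""
    return A[(0, 0)] > 0 and all(w <= tol for j, w in A.items() if j != (0, 0))

ok = has_k_matrix_signs(rotated_stencil(0.1, math.pi / 4))             # (eps-1)cs < 0
bad_sign = has_k_matrix_signs(rotated_stencil(0.1, -math.pi / 4))      # (eps-1)cs > 0
bad_avg = has_k_matrix_signs(rotated_stencil(0.1, math.pi / 4, "4.5.8"))
bad_459 = has_k_matrix_signs(rotated_stencil(1e-3, math.radians(15)))  # (4.5.9) violated
```

Only the first case passes; the other three illustrate the three ways in which the sign pattern is lost.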
Finite difference discretization results in the following stencil for (4.5.4), with $h_1 = h_2 = h$ and multiplying by $h^2$:

$$[A] = \varepsilon\begin{bmatrix} & -1 & \\ -1 & 4 & -1 \\ & -1 & \end{bmatrix} + c\,\frac{h}{2}\begin{bmatrix} -1 & 0 & 1 \end{bmatrix} + s\,\frac{h}{2}\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \tag{4.5.10}$$

In (4.5.10) central differences have been used to discretize the convection terms in (4.5.4). With upwind differences we obtain

$$[A] = \varepsilon\begin{bmatrix} & -1 & \\ -1 & 4 & -1 \\ & -1 & \end{bmatrix} + \frac{h}{2}\begin{bmatrix} -c - |c| & 2|c| & c - |c| \end{bmatrix} + \frac{h}{2}\begin{bmatrix} s - |s| \\ 2|s| \\ -s - |s| \end{bmatrix} \tag{4.5.11}$$

Stencil (4.5.10) gives a K-matrix only if the well known conditions on the mesh Peclet numbers are fulfilled:

$$|c|h/\varepsilon < 2, \qquad |s|h/\varepsilon < 2 \tag{4.5.12}$$

Stencil (4.5.11) always results in a K-matrix, which is the main motivation for using upwind differences. Often, in applications (for example, fluid dynamics) conditions (4.5.12) are violated, and discretization (4.5.10) is hard to handle with multigrid methods; therefore discretization (4.5.11) will mainly be considered.
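
The effect of the mesh Peclet number on the sign pattern can be illustrated as follows. This is a sketch only: the dictionary representation of the stencil (offsets $(j_1, j_2)$, with $j_2 = +1$ the top row) and the names `convdiff_stencil` and `has_k_matrix_signs` are assumptions introduced here, and the parameter values are arbitrary.

```python
import math

def convdiff_stencil(eps, beta, h, upwind=True):
    """Stencil [A] for (4.5.4) multiplied by h^2: (4.5.11) if upwind, else (4.5.10)."""
    c, s = math.cos(beta), math.sin(beta)
    A = {(0, 0): 4 * eps, (-1, 0): -eps, (1, 0): -eps, (0, -1): -eps, (0, 1): -eps}
    if upwind:
        conv = {(-1, 0): -c - abs(c), (1, 0): c - abs(c),
                (0, 1): s - abs(s), (0, -1): -s - abs(s),
                (0, 0): 2 * abs(c) + 2 * abs(s)}
    else:
        conv = {(-1, 0): -c, (1, 0): c, (0, 1): s, (0, -1): -s}
    for j, w in conv.items():
        A[j] += 0.5 * h * w
    return A

def has_k_matrix_signs(A, tol=1e-14):
    """True if the centre is positive and all other entries are non-positive."""
    return A[(0, 0)] > 0 and all(w <= tol for j, w in A.items() if j != (0, 0))

h = 1.0 / 64
central_small_eps = has_k_matrix_signs(convdiff_stencil(1e-3, 0.0, h, upwind=False))
central_large_eps = has_k_matrix_signs(convdiff_stencil(0.1, 0.0, h, upwind=False))
upwind_small_eps = has_k_matrix_signs(convdiff_stencil(1e-3, 0.0, h, upwind=True))
```

With $h = 1/64$ and $\beta = 0$ the mesh Peclet number $|c|h/\varepsilon$ is about 15.6 for $\varepsilon = 10^{-3}$, so the central stencil loses the sign pattern, whereas the upwind stencil keeps it.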

Definition  of  robustness 

We can now define robustness more precisely: a smoothing method is called robust if, for the above test problems, $\rho \le \rho^* < 1$ or $\rho_D \le \rho^* < 1$ with $\rho^*$ independent of $\varepsilon$ and $h$, for some $h_0 > h > 0$.
Numerical calculation of Fourier smoothing factor

Using the explicit expression (4.5.2) for $\lambda(\theta)$, it is not difficult to compute $|\lambda(\theta)|$, and to find its largest value on the discrete set $\Theta_r$ or $\Theta_r^D$ and hence the Fourier smoothing factors $\rho$ or $\rho_D$. By choosing in the definition of $\Theta_r$ (for example (4.4.20) or (4.4.21)) various values of $n_\alpha$ one may gather numerical evidence that (4.4.22) is satisfied. Computation of the mesh-independent smoothing factor $\bar\rho$ defined in (4.4.23) is more difficult numerically, since this involves finding a maximum on an infinite set. In simple cases $\bar\rho$ can be found analytically, as we shall see shortly. Extrema of $|\lambda(\theta)|$ on $\bar\Theta_r$ are found where $\partial|\lambda(\theta)|/\partial\theta_\alpha = 0$, $\alpha = 1, 2, \ldots, d$, and at the boundary of $\bar\Theta_r$. Of course, for a specific application one can compute $\rho$ for the values of $n_\alpha$ occurring in this application, without worrying about the limit $n_\alpha \to \infty$. In the following, we often present results for $n_1 = n_2 = n = 64$. It is found that the smoothing factors $\rho$, $\rho_D$ do not change much if $n$ is increased beyond 64, except in those cases where $\rho$ and $\rho_D$ differ appreciably. An analysis will be given of what happens in those cases.

All smoothing methods to be discussed in this chapter have been defined in Sections 3.3 to 3.5.
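
The numerical procedure just described amounts to a double loop over the discrete wavenumber set. The sketch below is one possible arrangement for $d = 2$, standard coarsening and $n_1 = n_2 = n$; the function name `smoothing_factors` is hypothetical, and the amplification factor used as input (damped point Jacobi for the five-point Laplacian with $\omega = 4/5$) is an assumed example chosen only to exercise the routine.

```python
import math

def smoothing_factors(lam, n):
    """Return (rho, rho_D): the maximum of |lam(theta1, theta2)| over the rough
    sets of (4.4.20) and (4.4.26), standard coarsening, n1 = n2 = n."""
    rho = rho_d = 0.0
    for k1 in range(-n // 2 + 1, n // 2 + 1):
        for k2 in range(-n // 2 + 1, n // 2 + 1):
            if 4 * abs(k1) < n and 4 * abs(k2) < n:
                continue                      # smooth wavenumber, skip
            a = abs(lam(2 * math.pi * k1 / n, 2 * math.pi * k2 / n))
            rho = max(rho, a)
            if k1 != 0 and k2 != 0:           # theta_a = 0 excluded for rho_D
                rho_d = max(rho_d, a)
    return rho, rho_d

omega = 0.8   # assumed damping parameter for the example

def lam(t1, t2):
    """Assumed example: damped point Jacobi, five-point Laplacian."""
    return 1.0 - omega * (2.0 - math.cos(t1) - math.cos(t2)) / 2.0

results = {n: smoothing_factors(lam, n) for n in (16, 32, 64)}
```

Repeating the computation for several values of $n$, as done here, is the kind of numerical evidence for (4.4.22) referred to above.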

Local  smoothing 

Local  freezing  of  the  coefficients  is  not  realistic  near  points  where  the  coefficients  are  not 
smooth.  Such  points  may  occur  if  the  computational  grid  has  been  obtained  as  a  boundary 
fitted coordinate mapping of a physical domain with a non-smooth boundary. Near points on
the  boundary  which  are  the  images  of  the  points  where  the  physical  domain  boundary  is  not 
smooth,  and  where  the  mapping  is  singular,  the  smoothing  performance  often  deteriorates. 
This  effect  may  be  counterbalanced  by  performing  additional  local  smoothing  in  a  few  grid 
points  in  a  neighbourhood  of  these  singular  points.  Because  only  a  few  points  are  involved, 
the  additional  cost  is  usually  low,  apart  from  considerations  of  vector  and  parallel  computing. 
This  procedure  is  described  in  [23]  and  [9]  and  analysed  theoretically  in  [110]  and  [24]. 

4.6  Jacobi  smoothing 

Anisotropic  diffusion  equation 
Point  Jacobi 

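Point Jacobi with damping corresponds to the following splitting (cf. Exercise 3.1.1), in stencil notation:

$$M(0) = \omega^{-1}A(0), \qquad M(j) = 0,\ j \ne 0 \tag{4.6.1}$$

Assuming periodic boundary conditions we obtain, using (4.5.6) and (4.5.2), in the special case $c = 1$, $s = 0$

$$\lambda(\theta) = 1 + \omega(\varepsilon\cos\theta_1 - \varepsilon + \cos\theta_2 - 1)/(1 + \varepsilon) \tag{4.6.2}$$

Because of symmetry $\bar\Theta_r$ can be confined to the hatched region of Figure 4.6.1. Clearly, $\bar\rho \ge |\lambda(\pi, \pi)| = |1 - 2\omega| \ge 1$ for $\omega \notin (0, 1)$. For $\omega \in (0, 1)$ we have for $\theta \in CDEF$: $\lambda(\pi, \pi) \le \lambda(\theta) \le \lambda(0, \pi/2)$, or $1 - 2\omega \le \lambda(\theta) \le 1 - \omega/(1 + \varepsilon)$. For $\theta \in ABCG$ we have

$$\lambda(\pi, \pi/2) \le \lambda(\theta) \le \lambda(\pi/2, 0),$$

or $1 - [(1 + 2\varepsilon)/(1 + \varepsilon)]\omega \le \lambda(\theta) \le 1 - [\varepsilon/(1 + \varepsilon)]\omega$.

Figure 4.6.1: Rough wavenumbers for damped Jacobi.

Hence

$$\bar\rho = \max\{|1 - 2\omega|,\ |1 - \varepsilon\omega/(1 + \varepsilon)|\} \tag{4.6.3}$$

The minimum value of $\bar\rho$ and the corresponding optimum value of $\omega$ are

$$\bar\rho = (2 + \varepsilon)/(2 + 3\varepsilon), \qquad \omega = (2 + 2\varepsilon)/(2 + 3\varepsilon) \tag{4.6.4}$$

For $\varepsilon = 1$ (Laplace's equation) we have $\bar\rho = 3/5$, $\omega = 4/5$. For $\varepsilon \ll 1$ this is not a good smoother, since $\lim_{\varepsilon \downarrow 0} \bar\rho = 1$. The case $\varepsilon > 1$ follows from the case $\varepsilon < 1$ by replacing $\varepsilon$ by $1/\varepsilon$.

Note that $\bar\rho$ is attained for $\theta \in \Theta_r$, so that here

$$\bar\rho = \rho \tag{4.6.5}$$

For $\omega = 1$ we have $\bar\rho = 1$, so that we have an example of a convergent method which is not a smoother.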

Dirichlet  boundary  conditions 

In the case of point Jacobi smoothing the Fourier sine series is applicable (see [141]), so that Dirichlet boundary conditions can be handled exactly. It is found that with the sine series $\lambda(\theta)$ is still given by (4.6.2), so all that needs to be done is to replace $\Theta_r$ by $\Theta_r^D$ in the preceding analysis. This is an example where our heuristic definition of $\rho_D$ leads to the correct result. Assume $n_1 = n_2 = n$. The whole of $\Theta_r^D$ is within the hatched region of Figure 4.6.1. Reasoning as before we obtain, for $0 < \varepsilon < 1$:

$$\lambda(\pi, \pi) \le \lambda(\theta) \le \lambda(2\pi/n, \pi/2), \qquad \lambda(\pi, \pi/2) \le \lambda(\theta) \le \lambda(\pi/2, 2\pi/n) \tag{4.6.6}$$

Hence, to leading order in $1/n$, $\rho_D = \max\{|1 - 2\omega|,\ |1 - \omega(\varepsilon + 2\pi^2/n^2)/(1 + \varepsilon)|\}$, so that $\rho_D = \bar\rho + O(n^{-2})$, and again we conclude that point Jacobi is not a robust smoother for the anisotropic diffusion equation.


Line  Jacobi 

We start again with some analytical considerations. Damped vertical line Jacobi iteration applied to the discretized anisotropic diffusion equation (4.5.6) with $c = 1$, $s = 0$ corresponds to the splitting

$$[M] = \omega^{-1}\begin{bmatrix} & -1 & \\ 0 & 2 + 2\varepsilon & 0 \\ & -1 & \end{bmatrix} \tag{4.6.7}$$

The amplification factor is given by

$$\lambda(\theta) = \omega\varepsilon\cos\theta_1/(1 + \varepsilon - \cos\theta_2) + 1 - \omega \tag{4.6.8}$$

both for the exponential and the sine Fourier series. We note immediately that $|\lambda(\pi, 0)| = 1$ if $\omega = 1$, so that for $\omega = 1$ this seems to be a bad smoother. This is surprising, because as $\varepsilon \downarrow 0$ the method becomes an exact solver. This apparent contradiction is resolved by taking boundary conditions into account. In Example 4.6.1 it is shown that

$$\rho_D = |\lambda(\pi, \varphi)| = \varepsilon/(1 + \varepsilon - \cos\varphi) \quad \text{for } \omega = 1 \tag{4.6.9}$$

where $\varphi = 2\pi/n$. As $n \to \infty$ we have

$$\rho_D \simeq (1 + 2\pi^2h^2/\varepsilon)^{-1} \tag{4.6.10}$$

so that indeed $\lim_{\varepsilon \downarrow 0} \rho_D = 0$. Better smoothing performance may be obtained by varying $\omega$. In Example 4.6.1 it is shown that $\bar\rho$ is minimized by

$$\omega = \frac{2 + 2\varepsilon}{3 + 2\varepsilon} \tag{4.6.11}$$

Note that for $0 < \varepsilon < 1$ we have $2/3 < \omega < 4/5$, so that the optimum value of $\omega$ is only weakly dependent on $\varepsilon$. We also find that for $\omega$ in this range the smoothing factor depends only weakly on $\omega$. We will see shortly that fortunately this seems to be true for more general problems also.

With $\omega$ according to (4.6.11) we have

$$\bar\rho = 2\omega - 1 = (1 + 2\varepsilon)/(3 + 2\varepsilon) \tag{4.6.12}$$
Choosing $\omega = 0.7$ we obtain

$$\bar\rho = \max\{1 - 0.7/(1 + \varepsilon),\ 0.4\} \tag{4.6.13}$$

which shows that we have a good smoother for all $0 < \varepsilon < 1$, with an $\varepsilon$-independent $\omega$.
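
The statements about damped vertical line Jacobi can be checked numerically from (4.6.8). The Python sketch below is illustrative, with hypothetical function names; it evaluates $\rho_D$ for $\omega = 1$ from (4.6.9), compares it with the approximation (4.6.10), and evaluates the two candidates $|\lambda(0, \pi/2)|$ and $|\lambda(\pi, 0)|$ that determine $\bar\rho$ in Example 4.6.1. The values $n = 64$ and $\varepsilon = 10^{-2}$ are assumptions chosen for the example.

```python
import math

def lam_line_jacobi(t1, t2, eps, omega):
    """Amplification factor (4.6.8) of damped vertical line Jacobi."""
    return omega * eps * math.cos(t1) / (1.0 + eps - math.cos(t2)) + 1.0 - omega

def rho_bar(eps, omega):
    """Maximum of the two boundary candidates used in Example 4.6.1."""
    return max(abs(lam_line_jacobi(0.0, math.pi / 2, eps, omega)),
               abs(lam_line_jacobi(math.pi, 0.0, eps, omega)))

n, eps = 64, 1e-2
phi = 2.0 * math.pi / n
rho_d_exact = abs(lam_line_jacobi(math.pi, phi, eps, 1.0))      # (4.6.9)
rho_d_approx = 1.0 / (1.0 + 2.0 * math.pi ** 2 / (n * n * eps))  # (4.6.10), h = 1/n

omega_opt = (2.0 + 2.0 * 0.5) / (3.0 + 2.0 * 0.5)                # (4.6.11), eps = 0.5
rho_opt = rho_bar(0.5, omega_opt)
rho_fixed = rho_bar(1.0, 0.7)                                    # fixed omega = 0.7
```

At the optimal $\omega$ the two candidates balance, so that the smoothing factor equals $2\omega - 1$; with the fixed value $\omega = 0.7$ and $\varepsilon = 1$ the sketch gives 0.65.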

Example 4.6.1. Derivation of (4.6.9) and (4.6.11). Note that $\lambda(\theta)$ is real, and that we need to consider only $\theta_\alpha \ge 0$. It is found that $\partial\lambda/\partial\theta_1 = 0$ only for $\theta_1 = 0, \pi$. Starting with $\rho_D$, we see that $\max\{|\lambda(\theta)| : \theta \in \Theta_r^D\}$ is attained on the boundary of $\Theta_r^D$. Assume $n_1 = n_2 = n$, and define $\varphi = 2\pi/n$. It is easily seen that $\max\{|\lambda(\theta)| : \theta \in \Theta_r^D\}$ will be either $|\lambda(\varphi, \pi/2)|$ or $|\lambda(\pi, \varphi)|$. If $\omega = 1$ it is $|\lambda(\pi, \varphi)|$, which gives us (4.6.9). We will determine the optimum value of $\omega$ not for $\rho_D$ but for $\bar\rho$. It is sufficient to look for the maximum of $|\lambda(\theta)|$ on the boundary of $\bar\Theta_r$. It is easily seen that

$$\bar\rho = \max\{|\lambda(0, \pi/2)|,\ |\lambda(\pi, 0)|\} = \max\{1 - \omega/(1 + \varepsilon),\ |1 - 2\omega|\}$$

which shows that we must take $0 < \omega < 1$. We find that the optimal $\omega$ is given by (4.6.11). Note that in this case we have $\bar\rho = \rho$.

Equation (4.5.5), for which the preceding analysis was done, corresponds to $\beta = 0$ in (4.5.3). For $\beta = \pi/2$ damped vertical line Jacobi does not work, but damped horizontal line Jacobi should be used. The general case may be handled by alternating Jacobi: vertical line followed by horizontal line Jacobi. Each step is damped separately with a fixed problem-independent value of $\omega$. After some experimentation $\omega = 0.7$ was found to be suitable (cf. (4.6.12) and (4.6.13)). Table 4.6.1 presents results. Here and in the remainder of this chapter we take $n_1 = n_2 = n$, and $\beta$ is sampled with intervals of 15°, unless stated otherwise. The worst case found is included in the tables that follow.

Increasing $n$, or finer sampling of $\beta$ around 45° or 0°, does not result in larger values of $\rho$ and $\rho_D$ than those listed in Table 4.6.1. It may be concluded that damped alternating Jacobi with a fixed damping parameter of $\omega = 0.7$ is an efficient and robust smoother for the rotated anisotropic diffusion equation, provided the mixed derivative is discretized according to (4.5.8). Note the good vectorization and parallelization potential of this method.
                    (4.5.6)                     (4.5.8)
ε           ρ, ρ_D      β           ρ, ρ_D      β
1           0.28        any         0.28        any
10^-1       0.63        45°         0.38        45°
10^-2       0.95        45°         0.44        45°
10^-3       1.00        45°         0.45        45°
10-®        1.00        45°         0.45        45°
10-®        1.00        45°         0.45        45°

Table 4.6.1: Fourier smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic diffusion equation (4.5.3) discretized according to (4.5.6) or (4.5.8); damped alternating Jacobi smoothing; $\omega = 0.7$; $n = 64$.

Convection-diffusion equation
Point Jacobi

For the convection-diffusion equation discretized with stencil (4.5.11) the amplification factor of damped point Jacobi is given by

$$\lambda(\theta) = \omega(2\cos\theta_1 + 2\cos\theta_2 + P_1e^{-i\theta_1} + P_2e^{-i\theta_2})/(4 + P_1 + P_2) + 1 - \omega \tag{4.6.14}$$

where $P_1 = ch/\varepsilon$, $P_2 = sh/\varepsilon$. Consider the special case: $P_1 = 0$, $P_2 = 4/\delta$. Then

$$\lambda(\pi, 0) = 1 - \omega + \omega/(1 + \delta) \tag{4.6.15}$$

so that $|\lambda(\pi, 0)| \to 1$ as $\delta \downarrow 0$, for all $\omega$, hence there is no value of $\omega$ for which this smoother is robust for the convection-diffusion equation.
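
The special case (4.6.15) can be reproduced directly from (4.6.14). In the sketch below the function name `lam_point_jacobi_cd` and the sample values of $\delta$ and $\omega$ are assumptions made for illustration.

```python
import cmath
import math

def lam_point_jacobi_cd(t1, t2, P1, P2, omega):
    """Amplification factor (4.6.14) of damped point Jacobi, upwind stencil (4.5.11)."""
    num = (2.0 * math.cos(t1) + 2.0 * math.cos(t2)
           + P1 * cmath.exp(-1j * t1) + P2 * cmath.exp(-1j * t2))
    return omega * num / (4.0 + P1 + P2) + 1.0 - omega

omega = 0.7
# |lambda(pi, 0)| for P1 = 0, P2 = 4/delta and decreasing delta
values = {delta: abs(lam_point_jacobi_cd(math.pi, 0.0, 0.0, 4.0 / delta, omega))
          for delta in (1.0, 1e-1, 1e-3)}
```

The computed values agree with $1 - \omega + \omega/(1 + \delta)$ and increase towards 1 as $\delta$ decreases.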

Line  Jacobi 

Let us apply the line Jacobi variant which was found to be robust for the rotated anisotropic diffusion equation, namely damped alternating Jacobi with $\omega = 0.7$, to the convection-diffusion test problem. Results are presented in Table 4.6.2.

Finer sampling of $\beta$ around $\beta = 0°$ and increasing $n$ does not result in significant changes. Numerical experiments show $\omega = 0.7$ to be a good value. It may be concluded that damped alternating Jacobi with a fixed damping parameter (for example, $\omega = 0.7$) is a robust and efficient smoother for the convection-diffusion test problem. The same was just found to be true for the rotated anisotropic diffusion test problem. The method vectorizes and parallelizes easily, so that all in all this is an attractive smoother.
ε           ρ           β           ρ_D         β
1           0.28        0°          0.28        0°
10^-1       0.28        0°          0.29        0°
10^-2       0.29        0°          0.29        0°
10^-3       0.29        0°          0.29        0°
10-®        0.40        0°          0.30        0°

Table 4.6.2: Fourier smoothing factors $\rho$, $\rho_D$ for the convection-diffusion equation discretized according to (4.5.11); damped alternating line Jacobi smoothing; $\omega = 0.7$; $n = 64$.

Exercise 4.6.1 Assume semi-coarsening as discussed in Section 4.4: $\bar h_1 = h_1$, $h_2 = \bar h_2/2$. Show that damped point Jacobi is a good smoother for equation (4.5.5) with $0 < \varepsilon < 1$.

Exercise 4.6.2 Show that the smoothing factor tends to 1 as $\varepsilon \downarrow 0$ for alternating Jacobi with damping parameter $\omega = 1$ applied to the convection-diffusion test problem.


4.7 Gauss-Seidel smoothing

Anisotropic  diffusion  equation 
Point  Gauss-Seidel 

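Forward point Gauss-Seidel iteration applied to (4.5.3) with $c = 1$, $s = 0$ corresponds to the splitting

$$[M] = \begin{bmatrix} & 0 & \\ -\varepsilon & 2\varepsilon + 2 & 0 \\ & -1 & \end{bmatrix}, \qquad [N] = \begin{bmatrix} & 1 & \\ 0 & 0 & \varepsilon \\ & 0 & \end{bmatrix} \tag{4.7.1}$$

The amplification factor is given by

$$\lambda(\theta) = (\varepsilon e^{i\theta_1} + e^{i\theta_2})/(-\varepsilon e^{-i\theta_1} + 2\varepsilon + 2 - e^{-i\theta_2}) \tag{4.7.2}$$

For $\varepsilon = 1$ (Laplace's equation) one obtains

$$\bar\rho = |\lambda(\pi/2, \cos^{-1}(4/5))| = 1/2 \tag{4.7.3}$$

To illustrate the technicalities that may be involved in determining $\bar\rho$ analytically, we give the details of the derivation of (4.7.3) in the following example.

Example 4.7.1. Smoothing factor of forward point Gauss-Seidel for Laplace equation. We can write

$$|\lambda(\theta)|^2 = (1 + \cos\beta)/(9 - 8\cos(\alpha/2)\cos(\beta/2) + \cos\beta) \tag{4.7.4}$$

with $\alpha = \theta_1 + \theta_2$, $\beta = \theta_1 - \theta_2$. Because of symmetry only $\alpha, \beta \ge 0$ has to be considered. We have

$$\partial|\lambda(\theta)|^2/\partial\alpha = 0 \quad \text{for} \quad \sin(\alpha/2)\cos(\beta/2) = 0 \tag{4.7.5}$$

This gives $\alpha = 0$ or $\alpha = 2\pi$ or $\beta = \pi$. For $\beta = \pi$ we have a minimum: $|\lambda|^2 = 0$. With $\alpha = 0$ we have $|\lambda(\theta)|^2 = \cos^2(\beta/2)/(2 - \cos(\beta/2))^2$, which reaches a maximum for $\beta = 2\pi$, i.e. at the boundary of $\bar\Theta_r$. With $\alpha = 2\pi$ we are also on the boundary of $\bar\Theta_r$. Hence, the maximum of $|\lambda(\theta)|$ is reached on the boundary of $\bar\Theta_r$. We have $|\lambda(\pi/2, \theta_2)|^2 = (1 + \sin\theta_2)/(9 + \sin\theta_2 - 4\cos\theta_2)$, of which the $\theta_2$ derivative equals 0 if $8\cos\theta_2 - 4\sin\theta_2 - 4 = 0$, hence $\theta_2 = -\pi/2$, which gives a minimum, or $\theta_2 = \pm\cos^{-1}(4/5)$. The largest maximum is obtained for $\theta_2 = \cos^{-1}(4/5)$. The extrema of $|\lambda(\pi, \theta_2)|$ are studied in similar fashion. Since $\lambda(\theta_1, \theta_2) = \lambda(\theta_2, \theta_1)$ there is no need to study $|\lambda(\theta_1, \pi/2)|$ and $|\lambda(\theta_1, \pi)|$. Equation (4.7.3) follows.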
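
The value 1/2 in (4.7.3) can be confirmed numerically, both at the stated maximizer and by a brute-force scan. The Python sketch below is illustrative only; the function name `lam_gs` and the sampling density `m = 64` are assumptions. The scan samples $\theta_\alpha = \pi k_\alpha/m$ and keeps the points outside the open square $(-\pi/2, \pi/2)^2$, which is a finite sample of $\bar\Theta_r$ for standard coarsening.

```python
import cmath
import math

def lam_gs(t1, t2, eps=1.0):
    """Amplification factor (4.7.2) of forward point Gauss-Seidel."""
    num = eps * cmath.exp(1j * t1) + cmath.exp(1j * t2)
    den = -eps * cmath.exp(-1j * t1) + 2.0 * eps + 2.0 - cmath.exp(-1j * t2)
    return num / den

# Value at the maximizer reported in (4.7.3)
at_max = abs(lam_gs(math.pi / 2, math.acos(0.8)))

# Brute-force maximum over the sampled rough region
m = 64
sampled = max(abs(lam_gs(math.pi * k1 / m, math.pi * k2 / m))
              for k1 in range(-m, m + 1) for k2 in range(-m, m + 1)
              if 2 * abs(k1) >= m or 2 * abs(k2) >= m)
```

The sampled maximum lies slightly below 1/2 because the maximizer $\theta_2 = \cos^{-1}(4/5)$ is not one of the sample points.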

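We will not determine $\bar\rho$ analytically for $\varepsilon \ne 1$, because this is very cumbersome. To do this numerically is easy, of course. Note that from (4.7.2) we have $\lambda(\pi, 0) = (1 - \varepsilon)/(1 + 3\varepsilon)$ and $\lambda(0, \pi) = (\varepsilon - 1)/(\varepsilon + 3)$, so that $\lim_{\varepsilon \downarrow 0} \lambda(\pi, 0) = 1$ and $\lim_{\varepsilon \to \infty} \lambda(0, \pi) = 1$. Hence forward point Gauss-Seidel is not a robust smoother for the anisotropic diffusion equation, if standard coarsening is used. See also Exercise 4.7.1.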

With semi-coarsening in the $x_2$ direction we obtain in Example 4.7.2: $\bar\rho \le \{(1 + \varepsilon)/(5 + \varepsilon)\}^{1/2}$, which is satisfactory for $\varepsilon \le 1$. For $\varepsilon > 1$ one should use semi-coarsening in the $x_1$-direction. Since in practice one may have $\varepsilon \ll 1$ in one part of the domain and $\varepsilon \gg 1$ in another, semi-coarsening gives a robust method with this smoother only if the direction of semi-coarsening is varied in the domain, which results in more complicated code than standard multigrid.

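Example 4.7.2. Influence of semi-coarsening. We will show

$$\bar\rho \le [(1 + \varepsilon)/(5 + \varepsilon)]^{1/2} \tag{4.7.6}$$

for the smoother defined by (4.7.1) with semi-coarsening in the $x_2$ direction. From (4.7.2) it follows that one may write $|\lambda(\theta)|^{-2} = 1 + (2 + 2\varepsilon)\mu(\theta)$ with $\mu(\theta) = (2 + 2\varepsilon - 2\varepsilon\cos\theta_1 - 2\cos\theta_2)/[1 + \varepsilon^2 + 2\varepsilon\cos(\theta_1 - \theta_2)]$. In this case, $\bar\Theta_r$ is given in Figure 4.4.2. On $\bar\Theta_r$ we have

$$\mu(\theta) \ge (2 + 2\varepsilon - 2\varepsilon\cos\theta_1 - 2\cos\theta_2)/(1 + \varepsilon)^2 \ge 2/(1 + \varepsilon)^2.$$

Hence $|\lambda(\theta)| \le [1 + 4/(1 + \varepsilon)]^{-1/2}$ and (4.7.6) follows.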
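
The bound (4.7.6) can be compared with a sampled maximum of $|\lambda(\theta)|$ over the rough region for semi-coarsening in the $x_2$ direction, where only $|\theta_2| \ge \pi/2$ counts as rough. The Python sketch below is an illustration under assumptions: the names `lam_gs` and `rho_semi` and the sampling density `m = 64` are hypothetical choices.

```python
import cmath
import math

def lam_gs(t1, t2, eps):
    """Amplification factor (4.7.2) of forward point Gauss-Seidel."""
    num = eps * cmath.exp(1j * t1) + cmath.exp(1j * t2)
    den = -eps * cmath.exp(-1j * t1) + 2.0 * eps + 2.0 - cmath.exp(-1j * t2)
    return num / den

def rho_semi(eps, m=64):
    """Sampled maximum of |lambda| over [-pi, pi] x {|theta_2| >= pi/2}."""
    best = 0.0
    for k1 in range(-m, m + 1):
        for k2 in range(-m, m + 1):
            if 2 * abs(k2) >= m:             # rough in the x2 direction
                best = max(best, abs(lam_gs(math.pi * k1 / m, math.pi * k2 / m, eps)))
    return best

# (eps, sampled smoothing factor, bound (4.7.6))
rows = [(eps, rho_semi(eps), math.sqrt((1.0 + eps) / (5.0 + eps)))
        for eps in (1.0, 0.1, 1e-3)]
# For comparison: a mode that is rough only under standard coarsening
standard_worst = abs(lam_gs(math.pi, 0.0, 1e-3))
```

The sampled values stay below the bound for all three values of $\varepsilon$, whereas the mode $(\pi, 0)$, which is rough for standard coarsening but not for this semi-coarsening, is hardly damped when $\varepsilon$ is small.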

For backward Gauss-Seidel the amplification factor is $\lambda(-\theta)$, with $\lambda(\theta)$ given by (4.7.2), so that the amplification factor of symmetric Gauss-Seidel is given by $\lambda(-\theta)\lambda(\theta)$. From (4.7.2) it follows that $|\lambda(\theta)| = |\lambda(-\theta)|$, so that the smoothing factor is the square of the smoothing factor for forward point Gauss-Seidel; hence, symmetric Gauss-Seidel is also not robust for the anisotropic diffusion equation. Also, point Gauss-Seidel-Jacobi (Section 3.3) does not work for this test problem.

The general rule is: points that are strongly coupled must be updated simultaneously. Here we mean by strongly coupled points: points with large coefficients (absolute) in $[A]$. For example, in the case of Equation (4.5.5) with $\varepsilon < 1$ points on the same vertical line are strongly coupled. Updating these points simultaneously leads to the use of line Gauss-Seidel.

Line  Gauss-Seidel 

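Forward vertical line Gauss-Seidel iteration applied to the anisotropic diffusion equation (4.5.5) corresponds to the splitting

$$[M] = \begin{bmatrix} & -1 & \\ -\varepsilon & 2\varepsilon + 2 & 0 \\ & -1 & \end{bmatrix}, \qquad [N] = \begin{bmatrix} 0 & 0 & \varepsilon \end{bmatrix} \tag{4.7.7}$$

The amplification factor is given by

$$\lambda(\theta) = \varepsilon e^{i\theta_1}/(2\varepsilon + 2 - 2\cos\theta_2 - \varepsilon e^{-i\theta_1}) \tag{4.7.8}$$

and we find in Example 4.7.3, which follows shortly:

$$\bar\rho = \max\{5^{-1/2},\ (2/\varepsilon + 1)^{-1}\} \tag{4.7.9}$$

Hence, $\lim_{\varepsilon \downarrow 0} \bar\rho = 5^{-1/2}$. This is surprising, because for $\varepsilon = 0$ we have, with Dirichlet boundary conditions, uncoupled non-singular tridiagonal systems along vertical lines, so that the smoother is an exact solver, just as in the case of line Jacobi smoothing, discussed before. The behaviour of this smoother in practice is better predicted by taking the influence of Dirichlet boundary conditions into account. We find in Example 4.7.3 below:

$$\begin{aligned} \varepsilon < (1 + \sqrt5)/2 &: \quad \rho_D = \varepsilon[\varepsilon^2 + (2\varepsilon + 2 - 2\cos\varphi)^2]^{-1/2} \\ \varepsilon \ge (1 + \sqrt5)/2 &: \quad \rho_D = \varepsilon[\varepsilon^2 + (2\varepsilon + 2)(2\varepsilon + 2 - 2\varepsilon\cos\varphi)]^{-1/2} \end{aligned} \tag{4.7.10}$$

with $\varphi = 2\pi h$, $h = 1/n$, assuming for simplicity $n_1 = n_2 = n$. For $\varepsilon < (1 + \sqrt5)/2$ and $h \downarrow 0$ this can be approximated by

$$\rho_D \simeq [1 + (2 + \varphi^2/\varepsilon)^2]^{-1/2} \tag{4.7.11}$$

and we see that the behaviour of $\rho_D$ as $\varepsilon \downarrow 0$, $h \downarrow 0$ depends on $\varphi^2/\varepsilon = 4\pi^2h^2/\varepsilon$. For $h \downarrow 0$ with $\varepsilon$ fixed we have $\rho_D \simeq \bar\rho$ and recover (4.7.9); for $\varepsilon \downarrow 0$ with $h$ fixed we obtain $\rho_D \simeq 0$. To give a practical example, with $h = 1/128$ and $\varepsilon = 10^{-6}$ we have $\rho_D \simeq 0.0004$.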

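Example 4.7.3. Derivation of (4.7.9) and (4.7.10). It is convenient to work with
$|\lambda(\theta)|^{-2}$. We have

$|\lambda(\theta)|^{-2} = [(2\varepsilon + 2 - \varepsilon\cos\theta_1 - 2\cos\theta_2)^2 + \varepsilon^2\sin^2\theta_1]/\varepsilon^2$.

$\min\{|\lambda(\theta)|^{-2} : \theta \in \Theta_r^D\}$ is determined as follows. We need to consider only $\theta_\alpha \ge 0$. It is
found that $\partial|\lambda(\theta)|^{-2}/\partial\theta_2 = 0$ for $\theta_2 = 0, \pi$ only. Hence the minimum is attained
on the boundary of $\Theta_r^D$. Choose for simplicity $n_1 = n_2 = n$, and define $\varphi = 2\pi/n$. It is easily
seen that in $\Theta_r^D$ we have

$|\lambda(\theta_1, \varphi)|^{-2} \ge |\lambda(\pi/2, \varphi)|^{-2}$,  $|\lambda(\varphi, \theta_2)|^{-2} \ge |\lambda(\varphi, \pi/2)|^{-2}$,
$|\lambda(\pi, \theta_2)|^{-2} \ge |\lambda(\pi, \varphi)|^{-2}$,  $|\lambda(\theta_1, \pi/2)|^{-2} \ge |\lambda(\varphi, \pi/2)|^{-2}$,
$|\lambda(\pi/2, \theta_2)|^{-2} \ge |\lambda(\pi/2, \varphi)|^{-2}$,  $|\lambda(\theta_1, \pi)|^{-2} \ge |\lambda(\varphi, \pi)|^{-2}$.

For $\varepsilon \le (1 + \sqrt{5})/2$ the minimum is $|\lambda(\pi/2, \varphi)|^{-2}$; for $\varepsilon > (1 + \sqrt{5})/2$, the minimum is
$|\lambda(\varphi, \pi/2)|^{-2}$. This gives us (4.7.10). We continue with (4.7.9). The behaviour of $|\lambda(\theta)|$ on
the boundary of $\Theta_r$ is found simply by letting $\varphi \to 0$ in the preceding results. Now there is
also the possibility of a minimum in the interior of $\Theta_r$, because $\theta_2 = 0$ is allowed, but this
leads to the minimum in $(\pi/2, 0)$, which is on the boundary, and (4.7.9) follows.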
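
As a cross-check of (4.7.9) and (4.7.10), the smoothing factors can also be obtained by brute force. The following sketch is an assumption-laden illustration, not part of the original notes: it samples (4.7.8) on the frequencies $\theta_\alpha = 2\pi k_\alpha/n$, treats a mode as rough when $\max(|\theta_1|, |\theta_2|) \ge \pi/2$ (standard coarsening), and obtains $\rho_D$ by leaving out the modes with $\theta_1 = 0$ or $\theta_2 = 0$. The function names are hypothetical.

```python
import cmath
import math

def line_gs_factor(t1, t2, eps):
    """Amplification factor (4.7.8) of forward vertical line Gauss-Seidel."""
    return eps * cmath.exp(1j * t1) / (
        2 * eps + 2 - 2 * math.cos(t2) - eps * cmath.exp(-1j * t1))

def smoothing_factors(eps, n):
    """Return (rho, rho_D) sampled on an n x n frequency grid."""
    rho = rho_d = 0.0
    for k1 in range(-n // 2 + 1, n // 2 + 1):
        for k2 in range(-n // 2 + 1, n // 2 + 1):
            if max(abs(k1), abs(k2)) * 4 < n:   # both |theta| < pi/2: smooth
                continue
            t1, t2 = 2 * math.pi * k1 / n, 2 * math.pi * k2 / n
            lam = abs(line_gs_factor(t1, t2, eps))
            rho = max(rho, lam)
            if k1 != 0 and k2 != 0:             # Dirichlet: zero frequencies excluded
                rho_d = max(rho_d, lam)
    return rho, rho_d

def rho_479(eps):
    """Closed form (4.7.9)."""
    return max(5 ** -0.5, 1 / (2 / eps + 1))

def rho_d_4710(eps, n):
    """Closed form (4.7.10) with phi = 2*pi/n."""
    phi = 2 * math.pi / n
    if eps <= (1 + math.sqrt(5)) / 2:
        return eps / math.sqrt(eps ** 2 + (2 * eps + 2 - 2 * math.cos(phi)) ** 2)
    return eps / math.sqrt(
        eps ** 2 + (2 * eps + 2) * (2 * eps + 2 - 2 * eps * math.cos(phi)))

if __name__ == "__main__":
    for eps in (0.1, 10.0):
        print(eps, smoothing_factors(eps, 64), rho_479(eps), rho_d_4710(eps, 64))
    print(rho_d_4710(1e-6, 128))   # the practical example with h = 1/128
```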

Equations (4.7.9) and (4.7.10) predict bad smoothing when $\varepsilon \gg 1$. Of course, for $\varepsilon \gg 1$
horizontal line Gauss-Seidel should be used. A good smoother for arbitrary $\varepsilon$ is alternating
line Gauss-Seidel. For analytical results, see [141]. Table 4.7.1 presents numerical values of
$\rho$ and $\rho_D$ for a number of cases. We take $n_1 = n_2 = n = 64$, $\beta = k\pi/12$, $k = 0, 1, 2, ..., 23$ in
(4.5.3), and present results only for a value of $\beta$ for which the largest $\rho$ or $\rho_D$ is obtained. In
the cases listed, $\rho = \rho_D$. Alternating line Gauss-Seidel is found to be a robust smoother for
the rotated anisotropic diffusion equation if the mixed derivative is discretized according to
(4.5.8), but not if (4.5.6) is used. Using under-relaxation does not change this conclusion.


                   (4.5.6)                  (4.5.8)
  $\varepsilon$        $\rho, \rho_D$    $\beta$        $\rho, \rho_D$    $\beta$
  1           0.15     any          0.15     any
  $10^{-1}$     0.38     105°         0.37     15°
  $10^{-2}$     0.86     45°          0.54     15°
  $10^{-3}$     0.98     45°          0.58     15°
  $10^{-5}$     1.00     45°          0.59     15°

Table 4.7.1: Fourier smoothing factors $\rho, \rho_D$ for the rotated anisotropic diffusion equation
(4.5.3) discretized according to (4.5.6) and (4.5.8); alternating line Gauss-Seidel smoothing;
n = 64.


Convection-diffusion equation
Point Gauss-Seidel

Forward point Gauss-Seidel iteration applied to the central discretization of the convection-
diffusion equation (4.5.10) is not a good smoother, see [141].

For the upwind discretization (4.5.11) one obtains, assuming c > 0, s > 0:

$\lambda(\theta) = \dfrac{e^{i\theta_1}[1 + (|P_1| - P_1)/2] + e^{i\theta_2}[1 + (|P_2| - P_2)/2]}{4 + |P_1| + |P_2| - e^{-i\theta_1}[1 + (P_1 + |P_1|)/2] - e^{-i\theta_2}[1 + (P_2 + |P_2|)/2]}$  (4.7.12)

with $P_1 = ch/\varepsilon$, $P_2 = sh/\varepsilon$ the mesh-Péclet numbers (for simplicity we assume $h_1 = h_2 = h$).
For $P_1 > 0$, $P_2 < 0$ we have $|\lambda(0, \pi)| = |P_2/(4 - P_2)|$, which tends to 1 as $|P_2| \to \infty$. To avoid
this the order in which the grid points are visited has to be reversed: backward Gauss-Seidel.
Symmetric point Gauss-Seidel (forward followed by backward) therefore is more promising
for the convection-diffusion equation. Table 4.7.2 gives some numerical results for $\rho$, for
$n_1 = n_2 = 64$. We give results for a value of $\beta$ in the set $\{\beta = k\pi/12 : k = 0, 1, 2, ..., 23\}$ for
which the largest $\rho$ and $\rho_D$ are obtained.
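
The special value $|\lambda(0, \pi)| = |P_2/(4 - P_2)|$ can be reproduced directly from the upwind stencil. The sketch below assumes the stencil of (4.5.11) scaled by $h^2/\varepsilon$, with off-diagonal entries as they appear in (4.7.12); the helper names `upwind_stencil` and `forward_gs` are hypothetical.

```python
import cmath
import math

def upwind_stencil(p1, p2):
    """Assumed upwind stencil (scaled by h^2/eps): west, east, south, north, centre."""
    west = -(1 + (p1 + abs(p1)) / 2)
    east = -(1 + (abs(p1) - p1) / 2)
    south = -(1 + (p2 + abs(p2)) / 2)
    north = -(1 + (abs(p2) - p2) / 2)
    centre = 4 + abs(p1) + abs(p2)
    return west, east, south, north, centre

def forward_gs(t1, t2, p1, p2):
    """Forward point Gauss-Seidel: west and south neighbours are already updated."""
    w, e, s, n, c = upwind_stencil(p1, p2)
    num = -(e * cmath.exp(1j * t1) + n * cmath.exp(1j * t2))
    den = c + w * cmath.exp(-1j * t1) + s * cmath.exp(-1j * t2)
    return num / den

if __name__ == "__main__":
    p1 = 3.0
    for p2 in (-1.0, -10.0, -1000.0):
        print(p2, abs(forward_gs(0.0, math.pi, p1, p2)), abs(p2 / (4 - p2)))
```

As $|P_2|$ grows the two printed columns agree and approach 1, which is the loss of smoothing described above.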


  $\varepsilon$        $\rho$      $\rho_D$     $\beta$
  1           0.25    0.25    0°
  $10^{-1}$     0.27    0.25    0°
  $10^{-2}$     0.45    0.28    105°
  $10^{-3}$     0.71    0.50    105°
  $10^{-5}$     0.77    0.71    105°

Table 4.7.2: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); symmetric point Gauss-Seidel smoothing.

Although this is not obvious from Table 4.7.2, the type of boundary condition may make
a large difference. For instance, for $\beta = 0$ and $\varepsilon \downarrow 0$ one finds numerically for forward point
Gauss-Seidel: $\rho = |\lambda(0, \pi/2)| = 1/\sqrt{5}$, whereas $\lim_{\varepsilon \downarrow 0} \rho_D = 0$, which is more realistic, since as
$\varepsilon \downarrow 0$ the smoother becomes an exact solver. The difference between $\rho$ and $\rho_D$ is explained
by noting that for $\theta_1 = \varphi = 2\pi h$ and $\varepsilon \ll 1$ we have $|\lambda(\varphi, \pi/2)|^2 = 1/(5 + y + \frac{1}{2}y^2)$ with
$y = 2\pi h^2/\varepsilon$.

For $\varepsilon \ll 1$ and $\beta = 105°$ Table 4.7.2 shows rather large smoothing factors. In fact, symmetric
point Gauss-Seidel smoothing is not robust for this test problem. This can be seen as follows.
If $P_1 < 0$, $P_2 > 0$ we find

$\lambda(\pi/2, 0) = \dfrac{1 + i(1 - P_1)}{3 - P_1 + i} \cdot \dfrac{1 + P_2 - i}{3 - P_1 + P_2 - i(1 - P_1)}$  (4.7.13)

Choosing $P_1 = -\alpha P_2$ one obtains, assuming $P_2 \gg 1$, $\alpha P_2 \gg 1$:

$|\lambda(\pi/2, 0)|^2 \simeq (1 + \alpha)^{-2}$  (4.7.14)

so that $\rho$ may get close to 1 if $\alpha$ is small. The remedy is to include more sweep directions.
Four-direction point Gauss-Seidel (consisting of four successive sweeps with four orderings:
the forward and backward orderings of Figure 3.3.1, the forward vertical line ordering of
Figure 3.3.1, and this last ordering reversed) is robust for this test problem, as illustrated by
Table 4.7.3.
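
The lack of robustness of symmetric point Gauss-Seidel can be illustrated by evaluating the product of the forward and backward amplification factors at $\theta = (\pi/2, 0)$. The sketch below is a hedged illustration: it assumes the upwind stencil of (4.5.11) scaled by $h^2/\varepsilon$, uses hypothetical helper names, and compares the product with the closed form of (4.7.13) as obtained from that stencil. A short calculation from that closed form suggests that for $P_2 \gg 1$, $\alpha P_2 \gg 1$ the modulus approaches $[(1 + \alpha)^2 + \alpha^2]^{-1/2}$, which is close to 1 for small $\alpha$.

```python
import cmath
import math

def upwind_stencil(p1, p2):
    """Assumed upwind stencil (scaled by h^2/eps): west, east, south, north, centre."""
    west = -(1 + (p1 + abs(p1)) / 2)
    east = -(1 + (abs(p1) - p1) / 2)
    south = -(1 + (p2 + abs(p2)) / 2)
    north = -(1 + (abs(p2) - p2) / 2)
    return west, east, south, north, 4 + abs(p1) + abs(p2)

def forward_gs(t1, t2, p1, p2):
    w, e, s, n, c = upwind_stencil(p1, p2)
    num = -(e * cmath.exp(1j * t1) + n * cmath.exp(1j * t2))
    return num / (c + w * cmath.exp(-1j * t1) + s * cmath.exp(-1j * t2))

def backward_gs(t1, t2, p1, p2):
    w, e, s, n, c = upwind_stencil(p1, p2)
    num = -(w * cmath.exp(-1j * t1) + s * cmath.exp(-1j * t2))
    return num / (c + e * cmath.exp(1j * t1) + n * cmath.exp(1j * t2))

def symmetric_gs(t1, t2, p1, p2):
    """Forward sweep followed by backward sweep."""
    return backward_gs(t1, t2, p1, p2) * forward_gs(t1, t2, p1, p2)

def closed_form_4713(p1, p2):
    """Closed form at theta = (pi/2, 0), valid for p1 < 0 < p2."""
    first = (1 + 1j * (1 - p1)) / (3 - p1 + 1j)
    second = (1 + p2 - 1j) / (3 - p1 + p2 - 1j * (1 - p1))
    return first * second

if __name__ == "__main__":
    alpha, p2 = 0.1, 1.0e4
    p1 = -alpha * p2
    lam = symmetric_gs(math.pi / 2, 0.0, p1, p2)
    print(abs(lam), abs(closed_form_4713(p1, p2)),
          1 / math.sqrt((1 + alpha) ** 2 + alpha ** 2))
```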

As before, we have taken $\beta = k\pi/12$, $k = 0, 1, 2, ..., 23$; Table 4.7.3 gives results only
for a value of $\beta$ for which the largest $\rho$ and $\rho_D$ are obtained. Clearly, four-direction point
Gauss-Seidel is an excellent smoother for the convection-diffusion equation. It is found that
$\rho$ and $\rho_D$ change little when n is increased further.

  $\varepsilon$        $\rho$       $\rho_D$      $\beta$
  1           0.040   0.040    0°
  $10^{-1}$     0.043   0.042    0°
  $10^{-2}$     0.069   0.068    0°
  $10^{-3}$     0.16    0.12     0°
  $10^{-5}$     0.20    0.0015   15°

Table 4.7.3: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); four-direction point Gauss-Seidel smoothing; n = 64.

Another useful smoother for this test problem is four-direction point Gauss-Seidel-Jacobi,
defined in Section 3.3. As an example, we give for discretization (4.5.11) the splitting for the
forward step:

$[M] = \varepsilon\begin{bmatrix} & 0 & \\ -1 & 4 & 0 \\ & 0 & \end{bmatrix} + \dfrac{h}{2}\begin{bmatrix} -c - |c| & 2|c| & 0 \end{bmatrix} + \dfrac{h}{2}\begin{bmatrix} 0 \\ 2|s| \\ 0 \end{bmatrix}$,  $[N] = [M] - [A]$  (4.7.15)

The amplification factor is easily derived. Table 4.7.4 gives results, sampling $\beta$ as before. The
results are satisfactory, but there seems to be a degradation of smoothing performance in the
vicinity of $\beta = 0°$ (and similarly near $\beta = k\pi/2$, $k = 1, 2, 3$). Finer sampling with intervals
of 1° gives the results of Table 4.7.5.

This smoother is clearly usable, but it is found that damping improves performance still
further. Numerical experiments show that $\omega = 0.8$ is a good value; each step is damped
separately. Results are given in Table 4.7.6. Clearly, this is an efficient and robust smoother
for the convection-diffusion equation, with $\omega$ fixed at $\omega = 0.8$. Choosing $\omega = 1$ gives a little
improvement for $\varepsilon/h \ge 0.1$, but in practice a fixed value of $\omega$ is to be preferred, of course.


  $\varepsilon$        $\rho$       $\rho_D$      $\beta$
  1           0.130   0.130    0°
  $10^{-1}$     0.130   0.130    45°
  $10^{-2}$     0.127   0.127    45°
  $10^{-3}$     0.247   0.242    15°
  $10^{-5}$     0.509   0.494    15°
  $10^{-8}$     0.514   0.499    15°

Table 4.7.4: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); four-direction point Gauss-Seidel-Jacobi smoothing; n = 64.


  $\varepsilon$        n      $\rho$       $\beta$     $\rho_D$      $\beta$
  $10^{-5}$     64     0.947   1°     0.562    8°
  $10^{-5}$     128    0.949   1°     0.680    5°

Table 4.7.5: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); four-direction point Gauss-Seidel-Jacobi smoothing.


  $\varepsilon$        $\rho, \rho_D$    $\beta$
  1.0         0.214    0°
  $10^{-1}$     0.214    0°
  $10^{-2}$     0.214    45°
  $10^{-3}$     0.217    45°
  $10^{-5}$     0.218    45°
  $10^{-8}$     0.218    45°

Table 4.7.6: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); four-direction point Gauss-Seidel-Jacobi smoothing with damping
parameter $\omega = 0.8$; n = 64.

Line Gauss-Seidel

For forward vertical line Gauss-Seidel we have

$\lambda(\theta) = e^{i\theta_1}[1 - (P_1 - |P_1|)/2] / \{4 + |P_1| + |P_2| - e^{-i\theta_1}[1 + (P_1 + |P_1|)/2]$
$\qquad - e^{i\theta_2}[1 + (|P_2| - P_2)/2] - e^{-i\theta_2}[1 + (P_2 + |P_2|)/2]\}$  (4.7.16)

For $P_1 < 0$, $P_2 > 0$ this gives $|\lambda(\pi, 0)| = (1 - P_1)/(3 - P_1)$, which tends to 1 as $|P_1| \to \infty$,
so that this smoother is not robust. Alternating line Gauss-Seidel is also not robust for this
test problem. If $P_2 < 0$, $P_1 = \alpha P_2$, $\alpha > 0$ and $|P_2| \gg 1$, $|\alpha P_2| \gg 1$ then

$\lambda(0, \pi/2) \simeq i\alpha/(1 + \alpha - i)$  (4.7.17)

so that $|\lambda(0, \pi/2)| = \alpha/[(1 + \alpha)^2 + 1]^{1/2}$, which tends to 1 if $\alpha \gg 1$. Symmetric (forward
followed by backward) horizontal and vertical line Gauss-Seidel are robust for this test problem.
Table 4.7.7 presents some results. Again, n = 64 and $\beta = k\pi/12$, $k = 0, 1, 2, ..., 23$; Table 4.7.7
gives results only for the worst case in $\beta$.

  $\varepsilon$        $\rho$      $\beta$      $\rho_D$      $\beta$
  1           0.20            0.20
  $10^{-1}$     0.20    0°      0.20     90°
  $10^{-2}$     0.20    90°     0.20     0°
  $10^{-3}$     0.30    0°      0.26     0°
  $10^{-5}$     0.33    0°      0.0019   75°

Table 4.7.7: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); symmetric vertical line Gauss-Seidel smoothing; n = 64.

We will not analyse these results further. Numerically we find for $\beta = 0$ and $\varepsilon \ll 1$ that
$\rho = |\lambda(0, \pi/2)| = (1 + P_1)/(9 + 3P_1) \simeq 1/3$. As $\varepsilon \downarrow 0$, $\rho_D$ depends on the value of $n\varepsilon$. It is
clear that we have a robust smoother.

We may conclude that alternating symmetric line Gauss-Seidel is robust for both test
problems, provided the mixed derivative is discretized according to (4.5.8). A disadvantage
of this smoother is that it does not lend itself to vectorized or parallel computing.

The Jacobi-type methods discussed earlier and Gauss-Seidel with pattern orderings (white-
black, zebra) are more favourable in this respect. Fourier smoothing analysis of Gauss-Seidel
with pattern orderings is more involved, and is postponed to a later section.
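
The two special values quoted for vertical line Gauss-Seidel on the upwind discretization, $(1 - P_1)/(3 - P_1)$ for the forward sweep and $(1 + P_1)/(9 + 3P_1)$ for the symmetric sweep with $P_2 = 0$, can be verified with a few lines of code. This is a minimal sketch under the assumption that the stencil of (4.5.11), scaled by $h^2/\varepsilon$, has the off-diagonal entries appearing in (4.7.16); the function names are hypothetical.

```python
import cmath
import math

def upwind_stencil(p1, p2):
    """Assumed upwind stencil (scaled by h^2/eps): west, east, south, north, centre."""
    west = -(1 + (p1 + abs(p1)) / 2)
    east = -(1 + (abs(p1) - p1) / 2)
    south = -(1 + (p2 + abs(p2)) / 2)
    north = -(1 + (abs(p2) - p2) / 2)
    return west, east, south, north, 4 + abs(p1) + abs(p2)

def vertical_line_gs(t1, t2, p1, p2, forward=True):
    """Vertical line Gauss-Seidel: each vertical line is solved exactly.

    Forward sweeps treat the west line as already updated, backward sweeps
    the east line."""
    w, e, s, n, c = upwind_stencil(p1, p2)
    line = c + s * cmath.exp(-1j * t2) + n * cmath.exp(1j * t2)
    if forward:
        return -e * cmath.exp(1j * t1) / (line + w * cmath.exp(-1j * t1))
    return -w * cmath.exp(-1j * t1) / (line + e * cmath.exp(1j * t1))

def symmetric_vertical_line_gs(t1, t2, p1, p2):
    return (vertical_line_gs(t1, t2, p1, p2, forward=False)
            * vertical_line_gs(t1, t2, p1, p2, forward=True))

if __name__ == "__main__":
    p1, p2 = -5.0, 2.0
    print(abs(vertical_line_gs(math.pi, 0.0, p1, p2)), (1 - p1) / (3 - p1))
    p1 = 50.0
    print(abs(symmetric_vertical_line_gs(0.0, math.pi / 2, p1, 0.0)),
          (1 + p1) / (9 + 3 * p1))
```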

Exercise 4.7.1 Show that damped point Gauss-Seidel is not robust for the rotated anisotropic
diffusion equation with c = 1, s = 0, with standard coarsening.

Exercise 4.7.2 As Exercise 4.7.1, but for the Gauss-Seidel-Jacobi method.


4.8  Incomplete  point  LU  smoothing 

For Fourier analysis it is necessary that [M] and [N] are constant, i.e. do not depend on the
location in the grid. For the methods just discussed this is the case if [A] is constant. For
incomplete factorization smoothing methods this is not, however, sufficient. Near the boundaries
of the domain [M] (and hence [N] = [M] - [A]) varies, usually tending rapidly to a constant
stencil away from the boundaries. Nevertheless, useful predictions about the smoothing
performance of incomplete factorization smoothing can be made by means of Fourier analysis.
How this can be done is best illustrated by means of an example.


Five-point ILU

This incomplete factorization has been defined in Section 4.4, in standard matrix notation.
In Section 4.4 A was assumed to have a five-point stencil. With application to test problem
(4.5.6) in mind, A is assumed to have the seven-point stencil given below. In stencil notation we
have

$[A] = \begin{bmatrix} f & g & \\ c & d & q \\ & a & b \end{bmatrix}$, $[L] = \begin{bmatrix} & 0 & \\ c & \delta_i & 0 \\ & a & \end{bmatrix}$, $[D] = [\delta_i]$, $[U] = \begin{bmatrix} & g & \\ 0 & \delta_i & q \\ & 0 & \end{bmatrix}$  (4.8.1)

where $i = (i_1, i_2)$. We will study the unmodified version. For $\delta_i$ we have the recursion (3.4.12)
with $\sigma = 0$:

$\delta_i = d - ag/\delta_{i - e_2} - cq/\delta_{i - e_1}$  (4.8.2)

where $e_1 = (1, 0)$, $e_2 = (0, 1)$. Terms involving negative values of $i_\alpha$, $\alpha = 1$ or 2, are to be
replaced by zero. We will show the following Lemma.

Lemma 4.8.1. If

$a + c + d + q + g \ge 0$,  $a, c, q, g \le 0$,  $d > 0$  (4.8.3)

then

$\lim_{i_1, i_2 \to \infty} \delta_i = \delta \equiv d/2 + [d^2/4 - (ag + cq)]^{1/2}$  (4.8.4)

The proof is given in [141]. Note that (4.8.3) is satisfied if b = f = 0 and A is a K-matrix
(Section 3.2). Obviously, $\delta$ is real, and $\delta \le d$. The rate at which the limit is reached in (4.8.4)
is studied in [141]. A sufficient number of mesh points away from the boundaries of the grid
G we have approximately $\delta_i = \delta$, and replacing $\delta_i$ by $\delta$ we obtain for $[M] = [L][D^{-1}][U]$:

$[M] = \begin{bmatrix} cg/\delta & g & \\ c & d & q \\ & a & aq/\delta \end{bmatrix}$  (4.8.5)

and standard Fourier smoothing analysis can be applied. Equation (4.8.5) is derived easily
by noting that in stencil notation $(ABu)_i = \sum_j \sum_k A(i, j)B(i + j, k)u_{i+j+k}$, so that
$A(i, j)B(i + j, k)$ gives a contribution to $C(i, j + k)$, where $C = AB$; by summing all contributions
one obtains $C(i, l)$. An explicit expression for $C(i, l)$ is $C(i, l) = \sum_j A(i, j)B(i + j, l - j)$,
since one can write $(Cu)_i = \sum_l \sum_j A(i, j)B(i + j, l - j)u_{i+l}$.
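
The convergence of $\delta_i$ to the limit (4.8.4) away from the boundaries is easy to observe numerically. The sketch below runs the unmodified recursion (4.8.2) over an n x n grid, dropping the terms with a negative index, and compares the value far from the boundaries with (4.8.4). The function names and the two test stencils (the five-point Laplacian, and an anisotropic stencil with $a = g = -1$, $c = q = -\varepsilon$, $d = 2 + 2\varepsilon$) are illustrative assumptions.

```python
import math

def delta_grid(a, c, d, q, g, n):
    """Run recursion (4.8.2) on an n x n grid; delta[i2][i1] corresponds to delta_i."""
    delta = [[0.0] * n for _ in range(n)]
    for i2 in range(n):
        for i1 in range(n):
            value = d
            if i2 > 0:                      # term with delta_{i - e_2}
                value -= a * g / delta[i2 - 1][i1]
            if i1 > 0:                      # term with delta_{i - e_1}
                value -= c * q / delta[i2][i1 - 1]
            delta[i2][i1] = value
    return delta

def delta_limit(a, c, d, q, g):
    """Limit value (4.8.4)."""
    return d / 2 + math.sqrt(d * d / 4 - (a * g + c * q))

if __name__ == "__main__":
    # Five-point Laplacian: the limit is 2 + sqrt(2).
    grid = delta_grid(-1.0, -1.0, 4.0, -1.0, -1.0, 40)
    print(grid[0][0], grid[5][5], grid[39][39], delta_limit(-1.0, -1.0, 4.0, -1.0, -1.0))
    # Anisotropic stencil with eps = 0.1: the limit is 1 + eps + sqrt(2*eps).
    eps = 0.1
    grid = delta_grid(-1.0, -eps, 2 + 2 * eps, -eps, -1.0, 100)
    print(grid[99][99], delta_limit(-1.0, -eps, 2 + 2 * eps, -eps, -1.0))
```

In both cases the corner value equals d and the interior values settle to the limit within a modest number of mesh points, which is what justifies replacing $\delta_i$ by $\delta$ in the Fourier analysis.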

Smoothing factor of five-point ILU

The modified version of incomplete factorization will be studied. As remarked in [145]
modification is better than damping, because if the error matrix N is small with $\sigma = 0$ it will also
be small with $\sigma \ne 0$. The optimum $\sigma$ depends on the problem. A fixed $\sigma$ for all problems is to
be preferred. From the analysis and experiments in [145] and [147] and our own experiments
it follows that $\sigma = 0.5$ is a good choice for all point-factorizations considered here and all
problems. Results will be presented with $\sigma = 0$ and $\sigma = 0.5$. The modified version of the
recursion (3.4.12) for $\delta_k$ is

$\delta_k = d - ag/\delta_{k - e_2} - cq/\delta_{k - e_1} + \sigma\{|aq/\delta_{k - e_2} - b| + |cg/\delta_{k - e_1} - f|\}$  (4.8.6)

The limiting value $\delta$ in the interior of the domain, far from the boundaries, satisfies (4.8.6)
with the subscripts omitted, and is easily determined numerically by the following recursion

$\delta_{k+1} = d - (ag + cq)/\delta_k + \sigma\{|aq/\delta_k - b| + |cg/\delta_k - f|\}$  (4.8.7)

The amplification factor is given by

$\lambda(\theta) = \{(aq/\delta - b)\exp[i(\theta_1 - \theta_2)] + (cg/\delta - f)\exp[i(\theta_2 - \theta_1)] + \sigma p\} /$
$\qquad \{a\exp(-i\theta_2) + aq\exp[i(\theta_1 - \theta_2)]/\delta + c\exp(-i\theta_1) + d + \sigma p$
$\qquad + q\exp(i\theta_1) + cg\exp[i(\theta_2 - \theta_1)]/\delta + g\exp(i\theta_2)\}$  (4.8.8)

where $p = |aq/\delta - b| + |cg/\delta - f|$.

Anisotropic diffusion equation

For the non-rotated ($\beta = 0°$) anisotropic diffusion equation with discretization (4.5.6) we have
$g = a = -1$, $c = q = -\varepsilon$, $d = 2 + 2\varepsilon$, $b = f = 0$, and we obtain: $\delta = 1 + \varepsilon + [2\varepsilon(1 + \sigma)]^{1/2}$,
and

$\lambda(\theta) = [\varepsilon\cos(\theta_1 - \theta_2)/\delta + \sigma\varepsilon/\delta] /$
$\qquad [1 + \varepsilon + \sigma\varepsilon/\delta - \varepsilon\cos\theta_1 - \cos\theta_2 + \varepsilon\cos(\theta_1 - \theta_2)/\delta]$  (4.8.9)

We will study a few special cases. For $\varepsilon = 1$ and $\sigma = 0$ we find in [141]:

$\rho = |\lambda(\pi/2, -\pi/3)| = (2\sqrt{3} + \sqrt{6} - 1)^{-1} \simeq 0.2035$  (4.8.10)

The case $\varepsilon = 1$, $\sigma \ne 0$ is analytically less tractable. For $\varepsilon \ll 1$ we find in [141]:

$0 \le \sigma < 1/2$:  $\rho \simeq |\lambda(\pi, 0)| = (1 - \sigma)/(2\delta - 1 + \sigma)$
$1/2 \le \sigma \le 1$:  $\rho \simeq |\lambda(\pi/2, 0)| = \sigma/(\sigma + \delta)$  (4.8.11)

$0 \le \sigma < 1/2$:  $\rho_D \simeq |\lambda(\pi, \tau)| = (1 - \sigma)/(2\delta - 1 + \sigma + \delta\tau^2/2\varepsilon)$
$1/2 \le \sigma \le 1$:  $\rho_D \simeq |\lambda(\pi/2, \tau)| = (\sigma + \tau)/(\sigma + \delta + \delta\tau^2/2\varepsilon)$  (4.8.12)

where $\tau = 2\pi/n_2$. These analytical results are confirmed by Table 4.8.1. For example, for
$\varepsilon = 10^{-3}$, $n_2 = 64$ and $\sigma = 1/2$ equation (4.8.12) gives $\rho_D \simeq 0.090$, $\rho \simeq 1/3$. Table 4.8.1
includes the worst case for $\beta$ in the set $\{\beta = k\pi/12, k = 0, 1, 2, ..., 23\}$. Here we have another
example showing that the influence of the type of the boundary conditions on smoothing
analysis may be important. For the non-rotated anisotropic diffusion equation ($\beta = 0°$ or
$\beta = 90°$) we have a robust smoother both for $\sigma = 0$ and $\sigma = 1/2$, provided the boundary
conditions are of Dirichlet type at those parts of the boundary that are perpendicular to the
direction of strong coupling. When $\beta$ is arbitrary, five-point ILU is not a robust smoother
with $\sigma = 0$ or $\sigma = 1/2$. We have not experimented with other values of $\sigma$, because, as it will
turn out, there are other smoothers that are robust, with a fixed choice of $\sigma$ that does not
depend on the problem.

  $\varepsilon$        $\sigma$      $\rho$            $\rho$          $\rho_D$           $\rho_D$
                     $\beta$ = 0°, 90°   $\beta$ = 15°    $\beta$ = 0°, 90°   $\beta$ = 15°
  1           0      0.20         0.20       0.20         0.20
  $10^{-1}$     0      0.48         1.48       0.46         1.44
  $10^{-2}$     0      0.77         7.84       0.58         6.90
  $10^{-3}$     0      0.92         13.0       0.16         10.8
  $10^{-5}$     0      0.99         13.9       0.002        11.5
  1           0.5    0.20         0.20       0.20         0.20
  $10^{-1}$     0.5    0.26         0.78*      0.26         0.78*
  $10^{-2}$     0.5    0.30         1.06       0.25         1.01
  $10^{-3}$     0.5    0.32         1.25       0.089        1.18
  $10^{-5}$     0.5    0.33         1.27       0.001        1.20

Table 4.8.1: Fourier smoothing factors $\rho, \rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6); five-point ILU smoothing; n = 64. In the cases marked with
*, $\beta = 45°$
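
The non-rotated entries of Table 4.8.1 can be reproduced approximately from (4.8.7) and (4.8.8). The sketch below is an illustration under stated assumptions: the stencil is $a = g = -1$, $c = q = -\varepsilon$, $d = 2 + 2\varepsilon$, $b = f = 0$ as given above; rough modes are those with $\max(|\theta_1|, |\theta_2|) \ge \pi/2$ on the grid $\theta_\alpha = 2\pi k_\alpha/n$; and $\rho_D$ leaves out the modes with a zero frequency component. The function names are hypothetical.

```python
import cmath
import math

def limit_delta(a, b, c, d, q, f, g, sigma, iters=2000):
    """Fixed-point iteration (4.8.7) for the interior value of delta."""
    delta = d
    for _ in range(iters):
        delta = (d - (a * g + c * q) / delta
                 + sigma * (abs(a * q / delta - b) + abs(c * g / delta - f)))
    return delta

def ilu5_factor(t1, t2, a, b, c, d, q, f, g, sigma, delta):
    """Amplification factor (4.8.8) of modified five-point ILU."""
    e = cmath.exp
    p = abs(a * q / delta - b) + abs(c * g / delta - f)
    num = ((a * q / delta - b) * e(1j * (t1 - t2))
           + (c * g / delta - f) * e(1j * (t2 - t1)) + sigma * p)
    den = (a * e(-1j * t2) + a * q * e(1j * (t1 - t2)) / delta + c * e(-1j * t1)
           + d + sigma * p + q * e(1j * t1)
           + c * g * e(1j * (t2 - t1)) / delta + g * e(1j * t2))
    return num / den

def smoothing_factors(eps, sigma, n=64):
    """Return (rho, rho_D) for the non-rotated anisotropic diffusion stencil."""
    a, b, c, d, q, f, g = -1.0, 0.0, -eps, 2 + 2 * eps, -eps, 0.0, -1.0
    delta = limit_delta(a, b, c, d, q, f, g, sigma)
    rho = rho_d = 0.0
    for k1 in range(-n // 2 + 1, n // 2 + 1):
        for k2 in range(-n // 2 + 1, n // 2 + 1):
            if max(abs(k1), abs(k2)) * 4 < n:   # smooth mode
                continue
            t1, t2 = 2 * math.pi * k1 / n, 2 * math.pi * k2 / n
            lam = abs(ilu5_factor(t1, t2, a, b, c, d, q, f, g, sigma, delta))
            rho = max(rho, lam)
            if k1 != 0 and k2 != 0:
                rho_d = max(rho_d, lam)
    return rho, rho_d

if __name__ == "__main__":
    for eps, sigma in ((1.0, 0.0), (1e-2, 0.5), (1e-3, 0.5)):
        print(eps, sigma, smoothing_factors(eps, sigma))
```

With $\varepsilon = 10^{-3}$ and $\sigma = 0.5$ this should give values near 0.32 and 0.089, in line with (4.8.11), (4.8.12) and the table.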


Convection-diffusion equation

Let us take $P_1 = -\alpha P_2$, $\alpha > 0$, $P_2 > 0$, where $P_1 = ch/\varepsilon$, $P_2 = sh/\varepsilon$. Then we have for the
convection-diffusion equation discretized according to (4.5.11): $a = -1 - P_2$, $b = f = 0$,
$c = -1$, $d = 4 + (1 + \alpha)P_2$, $q = -1 - \alpha P_2$, $g = -1$. After some manipulation one finds that if
$\alpha \ll 1$, $P_2 \gg 1$, $\alpha P_2 \gg 1$, then $\lambda(\pi/2, 0) \to i$ as $P_2 \to \infty$. This is in accordance with Table 4.8.2.
The worst case obtained when $\beta$ is varied according to $\beta = k\pi/12$, $k = 0, 1, 2, ..., 23$ is listed.
Clearly, five-point ILU is not robust for the convection-diffusion equation, at least for $\sigma = 0$
and $\sigma = 0.5$.


                 $\sigma$ = 0                 $\sigma$ = 0.5
  $\varepsilon$        $\rho$      $\rho_D$     $\beta$        $\rho$      $\rho_D$
  1           0.20    0.20    0°        0.20    0.20
  $10^{-1}$     0.21    0.21    0°        0.20    0.20
  $10^{-2}$     0.24    0.24    120°      0.24    0.24
  $10^{-3}$     0.60    0.60    105°      0.48    0.48
  $10^{-5}$     0.77    0.71    105°      0.59    0.58

Table 4.8.2: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); five-point ILU smoothing; n = 64


Seven-point ILU

Seven-point ILU tends to be more efficient and robust than five-point ILU. Assume

$[A] = \begin{bmatrix} f & g & \\ c & d & q \\ & a & b \end{bmatrix}$  (4.8.13)

The seven-point incomplete factorization $A = LD^{-1}U - N$ discussed in Section 4.4 is defined
in stencil notation as follows:

$[L] = \begin{bmatrix} 0 & 0 & \\ \gamma_i & \delta_i & 0 \\ & \alpha_i & \beta_i \end{bmatrix}$, $[D] = \begin{bmatrix} 0 & 0 & \\ 0 & \delta_i & 0 \\ & 0 & 0 \end{bmatrix}$, $[U] = \begin{bmatrix} \zeta_i & \eta_i & \\ 0 & \delta_i & \mu_i \\ & 0 & 0 \end{bmatrix}$  (4.8.14)


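We have, taking the limit $i \to \infty$, assuming the limit exists and writing $\lim_{i \to \infty} \alpha_i = \alpha$ etc.,

$\alpha = a$,  $\beta = b - a\mu/\delta$,  $\gamma = c - a\zeta/\delta$,
$\mu = q - \beta g/\delta$,  $\zeta = f - \gamma g/\delta$,  $\eta = g$  (4.8.15)

with $\delta$ the appropriate root of

$\delta = d - (ag + \beta\zeta + \gamma\mu)/\delta + \sigma(|\beta\mu/\delta| + |\gamma\zeta/\delta|)$  (4.8.16)

Numerical evidence indicates that the limiting $\delta$ resulting as $i \to \infty$ is the same as that for
the following recursion.

$\beta_0 = b$,  $\gamma_0 = c$,  $\delta_0 = d$,  $\mu_0 = q$,  $\zeta_0 = f$
$\beta_{j+1} = b - a\mu_j/\delta_j$,  $\gamma_{j+1} = c - a\zeta_j/\delta_j$
$\delta_{j+1} = d - (ag + \beta_{j+1}\zeta_j + \gamma_{j+1}\mu_j)/\delta_j + \sigma(|\beta_{j+1}\mu_j/\delta_j| + |\gamma_{j+1}\zeta_j/\delta_j|)$
$\mu_{j+1} = q - \beta_{j+1}g/\delta_{j+1}$,  $\zeta_{j+1} = f - \gamma_{j+1}g/\delta_j$  (4.8.17)

For M we find $M = LD^{-1}U = A + N$, with

$[N] = \begin{bmatrix} p_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & p_3 & 0 & 0 \\ 0 & 0 & 0 & 0 & p_1 \end{bmatrix}$,  $p_1 = \beta\mu/\delta$,  $p_2 = \gamma\zeta/\delta$,  $p_3 = \sigma(|p_1| + |p_2|)$  (4.8.18)

The amplification factor is given by

$\lambda(\theta) = \{p_3 + p_1\exp[i(2\theta_1 - \theta_2)] + p_2\exp[-i(2\theta_1 - \theta_2)]\} /$
$\qquad \{a\exp(-i\theta_2) + b\exp[i(\theta_1 - \theta_2)] + p_1\exp[i(2\theta_1 - \theta_2)] + c\exp(-i\theta_1)$
$\qquad + d + p_3 + q\exp(i\theta_1) + p_2\exp[-i(2\theta_1 - \theta_2)]$
$\qquad + f\exp[-i(\theta_1 - \theta_2)] + g\exp(i\theta_2)\}$  (4.8.19)

Anisotropic diffusion equation

For the anisotropic diffusion problem discretized according to (4.5.9) we have symmetry:
$\mu = \gamma$, $\zeta = \beta$, $g = a$, $f = b$, $q = c$, so that (4.8.19) becomes

$\lambda(\theta) = [\sigma p + p\cos(2\theta_1 - \theta_2)] /$
$\qquad [a\cos\theta_2 + b\cos(\theta_1 - \theta_2) + c\cos\theta_1 + d/2 + \sigma p + p\cos(2\theta_1 - \theta_2)]$  (4.8.20)

with $p = \beta\mu/\delta$.

With rotation angle $\beta = 90°$ and $\varepsilon \ll 1$ we find in [141]:

$0 \le \sigma < 1/2$:  $\rho \simeq |\lambda(0, \pi)| \simeq (1 - \sigma)/(2\delta^2 + \sigma - 1)$
$1/2 \le \sigma \le 1$:  $\rho \simeq |\lambda(0, \pi/2)| \simeq \sigma/(\delta^2 + \sigma)$  (4.8.21)

$0 \le \sigma < 1/2$:  $\rho_D \simeq |\lambda(\varphi, \pi)| \simeq |(\sigma - 1 + 2\varphi^2)/[\delta^2(2 + \varphi^2/2\varepsilon) + \sigma - 1]|$
$1/2 \le \sigma \le 1$:  $\rho_D \simeq |\lambda(\varphi, \pi/2)| \simeq |(\sigma + 2\varphi)/[\delta^2(1 + \varphi^2/2\varepsilon) + \sigma - 2\varphi]|$  (4.8.22)

with $\varphi = 2\pi/n_1$. These results agree approximately with Table 4.8.3. For example, for
$\varepsilon = 10^{-3}$, $n_1 = 64$ Equation (4.8.22) gives $\rho_D \simeq 0.152$ for $\sigma = 0$, and $\rho_D \simeq 0.103$ for $\sigma = 0.5$.
Table 4.8.3 includes the worst case for $\beta$ in the set $\{\beta = k\pi/12, k = 0, 1, 2, ..., 23\}$. Equations
(4.8.21) and (4.8.22) and Table 4.8.3 show that the boundary conditions may have an important
influence. For rotation angle $\beta = 0°$ or $\beta = 90°$, seven-point ILU is a good smoother for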
the anisotropic diffusion equation. With $\sigma = 0.5$ we have a robust smoother; finer sampling
of $\beta$ and increasing n gives results indicating that $\rho$ and $\rho_D$ are bounded away from 1. For
some values of $\beta$ this smoother is not, however, very effective. One might try other values of
$\sigma$ to diminish $\rho_D$. A more efficient and robust ILU type smoother will be introduced shortly.
In [141] it is shown that for $\beta = 45°$ and $\varepsilon \ll 1$

$\rho \simeq \max\left\{ \dfrac{|\sigma - 1|}{\sigma + 1}, \dfrac{\sigma}{\sigma + 1} \right\}$  (4.8.23)

Hence, the optimal value of $\sigma$ for this case is $\sigma = 0.5$, for which $\rho \simeq 1/3$.

  $\varepsilon$        $\sigma$     $\rho$        $\rho$         $\rho$, $\beta$        $\rho_D$            $\rho_D$        $\rho_D$, $\beta$
                    $\beta$ = 0°    $\beta$ = 90°                  $\beta$ = 0°          $\beta$ = 90°
  1           0     0.13     0.13      0.13, any     0.12            0.12       0.12, any
  $10^{-1}$     0     0.17     0.27      0.45, 75°     0.16            0.27       0.44, 75°
  $10^{-2}$     0     0.17     0.61      1.35, 75°     0.11            0.45       1.26, 75°
  $10^{-3}$     0     0.17     0.84      1.69, 75°     0.02            0.16       1.55, 75°
  $10^{-5}$     0     0.17     0.98      1.74, 75°     $10^{-\ldots}$         0.002      1.59, 75°
  1           0.5   0.11     0.11      0.11, any     0.11            0.11       0.11, any
  $10^{-1}$     0.5   0.089    0.23      0.50, 60°     0.087           0.23       0.50, 60°
  $10^{-2}$     0.5   0.091    0.27      0.77, 60°     0.075           0.25       0.77, 60°
  $10^{-3}$     0.5   0.091    0.31      0.8..., 60°   0.029           0.097      0.82, 60°
  $10^{-5}$     0.5   0.086    0.33      0.83, 60°     $4 \times 10^{-\ldots}$    $10^{-3}$     0.82, 60°

Table 4.8.3: Fourier smoothing factors $\rho, \rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6); seven-point ILU smoothing; n = 64
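
The limiting coefficients of seven-point ILU and the amplification factor (4.8.19) are straightforward to evaluate numerically. The sketch below is a minimal illustration, not the author's code: it iterates a recursion of the type (4.8.17) to a fixed point satisfying (4.8.15) and (4.8.16), then evaluates (4.8.19). It assumes that for $\beta = 90°$ the stencil is $a = g = -\varepsilon$, $c = q = -1$, $d = 2 + 2\varepsilon$, $b = f = 0$, and that rough modes are those with $\max(|\theta_1|, |\theta_2|) \ge \pi/2$. All function names are hypothetical.

```python
import cmath
import math

def ilu7_limit(a, b, c, d, q, f, g, sigma, iters=5000):
    """Iterate towards the limiting beta, gamma, delta, mu, zeta of (4.8.15)-(4.8.16)."""
    beta, gamma, delta, mu, zeta = b, c, d, q, f
    for _ in range(iters):
        beta = b - a * mu / delta
        gamma = c - a * zeta / delta
        new_delta = (d - (a * g + beta * zeta + gamma * mu) / delta
                     + sigma * (abs(beta * mu / delta) + abs(gamma * zeta / delta)))
        mu = q - beta * g / new_delta
        zeta = f - gamma * g / new_delta
        delta = new_delta
    return beta, gamma, delta, mu, zeta

def ilu7_factor(t1, t2, a, b, c, d, q, f, g, sigma, limit):
    """Amplification factor (4.8.19) of modified seven-point ILU."""
    beta, gamma, delta, mu, zeta = limit
    e = cmath.exp
    p1, p2 = beta * mu / delta, gamma * zeta / delta
    p3 = sigma * (abs(p1) + abs(p2))
    fill = p3 + p1 * e(1j * (2 * t1 - t2)) + p2 * e(-1j * (2 * t1 - t2))
    symbol_a = (a * e(-1j * t2) + b * e(1j * (t1 - t2)) + c * e(-1j * t1) + d
                + q * e(1j * t1) + f * e(-1j * (t1 - t2)) + g * e(1j * t2))
    return fill / (symbol_a + fill)

def rho_periodic(a, b, c, d, q, f, g, sigma, n=64):
    """Largest |lambda| over the rough modes of an n x n frequency grid."""
    limit = ilu7_limit(a, b, c, d, q, f, g, sigma)
    rho = 0.0
    for k1 in range(-n // 2 + 1, n // 2 + 1):
        for k2 in range(-n // 2 + 1, n // 2 + 1):
            if max(abs(k1), abs(k2)) * 4 < n:
                continue
            t1, t2 = 2 * math.pi * k1 / n, 2 * math.pi * k2 / n
            rho = max(rho, abs(ilu7_factor(t1, t2, a, b, c, d, q, f, g, sigma, limit)))
    return rho

if __name__ == "__main__":
    # Isotropic case (eps = 1), sigma = 0.
    print(rho_periodic(-1.0, 0.0, -1.0, 4.0, -1.0, 0.0, -1.0, 0.0))
    # Assumed beta = 90 degrees stencil, eps = 1e-3, sigma = 0.5, mode (0, pi/2).
    eps, sigma = 1e-3, 0.5
    coeffs = (-eps, 0.0, -1.0, 2 + 2 * eps, -1.0, 0.0, -eps)
    limit = ilu7_limit(*coeffs, sigma)
    print(abs(ilu7_factor(0.0, math.pi / 2, *coeffs, sigma, limit)))
```

The two printed values should be close to the entries 0.13 and 0.31 of Table 4.8.3.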


Convection-diffusion equation

Table 4.8.4 gives some results for the convection-diffusion equation. The worst case for $\beta$ in
the set $\{\beta = k\pi/12 : k = 0, 1, 2, ..., 23\}$ is listed. It is found numerically that $\rho \ll 1$ and
$\rho_D \ll 1$ when $\varepsilon \ll 1$, except for $\beta$ close to 0° or 180°, where $\rho$ and $\rho_D$ are found to be
larger than for other values of $\beta$, which may spell trouble. We, therefore, do some analysis.
Numerically it is found that for $\varepsilon \ll 1$ and $|s| \ll 1$ we have $\rho \simeq |\lambda(0, \pi/2)|$, both for $\sigma = 0$
and $\sigma = 1/2$. We proceed to determine $\lambda(0, \pi/2)$. Assume c < 0, s > 0; then (4.5.11) gives
$a = -\varepsilon - hs$, $b = 0$, $c = -\varepsilon$, $d = 4\varepsilon - ch + sh$, $q = -\varepsilon + hc$, $f = 0$, $g = -\varepsilon$. Equations
(4.8.15) and (4.8.16) give, assuming $\varepsilon \ll 1$, $|s| \ll 1$ and keeping only leading terms in $\varepsilon$ and
s, $\beta \simeq (\varepsilon + sh)ch/\delta$, $\gamma \simeq -\varepsilon$, $\mu \simeq ch$, $\zeta \simeq 0$, $\delta \simeq (s - c)h$, $p_1 \simeq (\varepsilon + sh)c^2/(s - c)^2$, $p_2 = 0$.
Substitution in (4.8.19) and neglect of a few higher order terms results in

$\lambda(0, \pi/2) \simeq \dfrac{(\sigma - i)(\tau + 1)}{(\tau + 2)(1 - 2\tan\beta) + \sigma(1 + \tau) + i(1 - 2\tau\tan\beta)}$  (4.8.24)

where $\tau = sh/\varepsilon$, so that
$\rho^2 \simeq (\tau + 1)^2(\sigma^2 + 1)/\{[(\tau + 2)(1 - 2\tan\beta) + \sigma(1 + \tau)]^2 + (1 - 2\tau\tan\beta)^2\}$  (4.8.25)

hence,

$\rho^2 \le (\sigma^2 + 1)/(\sigma + 1)^2$  (4.8.26)

Choosing $\sigma = 1/2$, (4.8.26) gives $\rho \le \frac{1}{3}\sqrt{5} \simeq 0.75$, so that the smoother is robust. With
$\sigma = 0$, inequality (4.8.26) does not keep $\rho$ away from 1. Equation (4.8.25) gives, for $\sigma = 0$:

$\lim_{\tau \downarrow 0} \rho = 1/\sqrt{5}$,  $\lim_{\tau \to \infty} \rho = (1 - 4\tan\beta + 8\tan^2\beta)^{-1/2}$  (4.8.27)

This is confirmed by numerical experiments. With $\sigma = 1/2$ we have a robust smoother for
the convection-diffusion equation. Alternating ILU, to be discussed shortly, may, however,
be more efficient. With $\sigma = 0$, $\rho \ll 1$ except in a small neighbourhood of $\beta = 0°$ and
$\beta = 180°$. Since in practice $\tau$ remains finite, some smoothing effect remains. For example,
for s = 0.1 ($\beta \simeq 174.3°$), h = 1/64 and $\varepsilon = 10^{-5}$ we have $\tau \simeq 156$ and (4.8.27) gives $\rho \simeq 0.82$.
This explains why in practice seven-point ILU with $\sigma = 0$ is a satisfactory smoother for the
convection-diffusion equation but $\sigma = 1/2$ gives a better smoother.

                 $\sigma$ = 0                  $\sigma$ = 0.5
  $\varepsilon$        $\rho$      $\rho_D$     $\beta$        $\rho$      $\rho_D$     $\beta$
  1           0.13    0.12    90°       0.11    0.11    0°
  $10^{-1}$     0.13    0.13    90°       0.12    0.12    0°
  $10^{-2}$     0.16    0.16    0°        0.17    0.17    165°
  $10^{-3}$     0.44    0.43    165°      0.37    0.37    165°
  $10^{-5}$     0.58    0.54    165°      0.47    0.47    165°

Table 4.8.4: Fourier smoothing factors $\rho, \rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); seven-point ILU smoothing; n = 64
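
The numbers in this example follow from simple arithmetic on (4.8.25) to (4.8.27). The short sketch below evaluates them for s = 0.1, h = 1/64 and $\varepsilon = 10^{-5}$; it assumes c < 0, so that $\tan\beta = s/c$ is negative, and the function names are hypothetical.

```python
import math

def rho_4825(tau, tan_beta, sigma):
    """Approximate smoothing factor from (4.8.25)."""
    num = (tau + 1) ** 2 * (sigma ** 2 + 1)
    den = (((tau + 2) * (1 - 2 * tan_beta) + sigma * (1 + tau)) ** 2
           + (1 - 2 * tau * tan_beta) ** 2)
    return math.sqrt(num / den)

def bound_4826(sigma):
    """Upper bound (4.8.26) on rho."""
    return math.sqrt(sigma ** 2 + 1) / (sigma + 1)

def limit_large_tau(tan_beta):
    """Second limit in (4.8.27), valid for sigma = 0."""
    return (1 - 4 * tan_beta + 8 * tan_beta ** 2) ** -0.5

if __name__ == "__main__":
    s, h, eps = 0.1, 1.0 / 64, 1e-5
    c = -math.sqrt(1 - s * s)            # c < 0, i.e. beta close to 180 degrees
    tan_beta = s / c
    tau = s * h / eps
    print("beta  =", math.degrees(math.atan2(s, c)))
    print("tau   =", tau)
    print("limit =", limit_large_tau(tan_beta))
    print("rho (sigma = 0)   =", rho_4825(tau, tan_beta, 0.0))
    print("rho (sigma = 0.5) =", rho_4825(tau, tan_beta, 0.5), "<=", bound_4826(0.5))
```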

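Nine-point ILU

Assume

$[A] = \begin{bmatrix} f & g & p \\ c & d & q \\ z & a & b \end{bmatrix}$  (4.8.28)

Reasoning as before, we have

$[L] = \begin{bmatrix} 0 & 0 & 0 \\ \gamma & \delta & 0 \\ \omega & \alpha & \beta \end{bmatrix}$, $[D] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \delta & 0 \\ 0 & 0 & 0 \end{bmatrix}$, $[U] = \begin{bmatrix} \zeta & \eta & \tau \\ 0 & \delta & \mu \\ 0 & 0 & 0 \end{bmatrix}$  (4.8.29)

For $\omega, \alpha, ..., \tau$ we have equations (4.4.22) in [141], here interpreted as equations for scalar
unknowns. The relevant solution of these equations may be obtained as the limit of the
following recursion

$\alpha_0 = a$,  $\beta_0 = b$,  $\gamma_0 = c$,  $\delta_0 = d$,  $\mu_0 = q$,  $\zeta_0 = f$,  $\eta_0 = g$
$\alpha_{j+1} = a - z\mu_j/\delta_j$,  $\beta_{j+1} = b - \alpha_{j+1}\mu_j/\delta_j$
$\gamma_{j+1} = c - (z\eta_j + \alpha_{j+1}\zeta_j)/\delta_j$
$n_{j+1} = \{|\beta_{j+1}\mu_j| + |z\zeta_j| + |\beta_{j+1}p| + |\gamma_{j+1}\zeta_j|\}/\delta_j$  (4.8.30)
$\delta_{j+1} = d - (zp + \alpha_{j+1}\eta_j + \beta_{j+1}\zeta_j + \gamma_{j+1}\mu_j)/\delta_j + \sigma n_{j+1}$
$\mu_{j+1} = q - (\alpha_{j+1}p + \beta_{j+1}\eta_j)/\delta_{j+1}$
$\zeta_{j+1} = f - \gamma_{j+1}\eta_j/\delta_j$,  $\eta_{j+1} = g - \gamma_{j+1}p/\delta_{j+1}$

For M we find $M = LD^{-1}U = A + N$, with

$N = \dfrac{1}{\delta}\begin{bmatrix} \gamma\zeta & 0 & 0 & 0 & 0 \\ z\zeta & 0 & \sigma n & 0 & \beta p \\ 0 & 0 & 0 & 0 & \beta\mu \end{bmatrix}$  (4.8.31)

with $n = |\gamma\zeta| + |z\zeta| + |\beta p| + |\beta\mu|$. The amplification factor is given by

$\lambda(\theta) = B(\theta)/\{B(\theta) + A(\theta)\}$  (4.8.32)

where

$B(\theta) = \{\gamma\zeta\exp[i(\theta_2 - 2\theta_1)] + z\zeta\exp(-2i\theta_1)$
$\qquad + \beta p\exp(2i\theta_1) + \beta\mu\exp[i(2\theta_1 - \theta_2)] + \sigma n\}/\delta$

and

$A(\theta) = z\exp[-i(\theta_1 + \theta_2)] + a\exp(-i\theta_2) + b\exp[i(\theta_1 - \theta_2)] + c\exp(-i\theta_1)$
$\qquad + d + q\exp(i\theta_1) + f\exp[i(\theta_2 - \theta_1)] + g\exp(i\theta_2) + p\exp[i(\theta_1 + \theta_2)]$

Anisotropic diffusion equation

For the anisotropic diffusion equation discretized according to (4.5.6) the nine-point ILU
factorization is identical to the seven-point ILU factorization. Table 4.8.5 gives results for the
case that the mixed derivative is discretized according to (4.5.8). In this case seven-point ILU
performs poorly. When the mixed derivative is absent ($\beta = 0°$ or $\beta = 90°$) nine-point ILU
is identical to seven-point ILU. Therefore Table 4.8.5 gives only the worst case for $\beta$ in the
set $\{\beta = k\pi/12, k = 0, 1, 2, ..., 23\}$. Clearly, the smoother is not robust for $\sigma = 0$. But also
for $\sigma = 1/2$ there are values of $\beta$ for which this smoother is not very effective. For example,
with finer sampling of $\beta$ around 75° one finds a local maximum of approximately $\rho_D = 0.73$
for $\beta = 85°$.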


                 $\sigma$ = 0                         $\sigma$ = 0.5
  $\varepsilon$        $\rho$      $\beta$      $\rho_D$     $\beta$        $\rho$      $\beta$      $\rho_D$     $\beta$
  1           0.13    any     0.12    any       0.11    any     0.11    any
  $10^{-1}$     0.52    75°     0.50    75°       0.42    75°     0.42    60°
  $10^{-2}$     ...     75°     1.34    75°       0.63    75°     0.63    75°
  $10^{-3}$     1.87    75°     1.62    75°       0.68    75°     0.68    75°
  $10^{-5}$     1.92    75°     1.66    75°       0.68    75°     0.68    75°

Table 4.8.5: Fourier smoothing factors $\rho, \rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6), but the mixed derivative discretized according to (4.5.8);
nine-point ILU smoothing; n = 64

Alternating seven-point ILU

The amplification factor of the second part (corresponding to the second backward grid point
ordering defined by (3.4.16)) of alternating seven-point ILU smoothing, with factors denoted
by $\bar{L}, \bar{D}, \bar{U}$, may be determined as follows. Let [A] be given by (4.8.13). The stencil
representation of the incomplete factorization discussed in Section 3.4 is

$[\bar{L}] = \begin{bmatrix} 0 & \bar\gamma & \\ 0 & \bar\delta & \bar\alpha \\ & 0 & \bar\beta \end{bmatrix}$, $[\bar{D}] = \begin{bmatrix} 0 & 0 & \\ 0 & \bar\delta & 0 \\ & 0 & 0 \end{bmatrix}$, $[\bar{U}] = \begin{bmatrix} \bar\zeta & 0 & \\ \bar\eta & \bar\delta & 0 \\ & \bar\mu & 0 \end{bmatrix}$  (4.8.33)

In [141] it is shown that $\bar\alpha, \bar\beta, ..., \bar\eta$ are given by (4.8.15) and (4.8.16), provided the following
substitutions are made:

$a \to q$,  $b \to b$,  $c \to g$,  $d \to d$,  $q \to a$,  $f \to f$,  $g \to c$  (4.8.34)

The iteration matrix is $\bar{M} = \bar{L}\bar{D}^{-1}\bar{U} = A + \bar{N}$. According to [141],

$[\bar{N}] = \begin{bmatrix} \bar{p}_2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & \bar{p}_3 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \bar{p}_1 \end{bmatrix}$  (4.8.35)

with $\bar{p}_1 = \bar\beta\bar\mu/\bar\delta$, $\bar{p}_2 = \bar\gamma\bar\zeta/\bar\delta$, $\bar{p}_3 = \sigma(|\bar{p}_1| + |\bar{p}_2|)$. It follows that the amplification factor $\bar\lambda(\theta)$
of the second step of alternating seven-point ILU smoothing is given by

$\bar\lambda(\theta) = \{\bar{p}_3 + \bar{p}_1\exp[i(\theta_1 - 2\theta_2)] + \bar{p}_2\exp[i(2\theta_2 - \theta_1)]\} /$
$\qquad \{a\exp(-i\theta_2) + b\exp[i(\theta_1 - \theta_2)] + c\exp(-i\theta_1) + d + \bar{p}_3 + q\exp(i\theta_1)$
$\qquad + f\exp[-i(\theta_1 - \theta_2)] + g\exp(i\theta_2) + \bar{p}_1\exp[i(\theta_1 - 2\theta_2)]$
$\qquad + \bar{p}_2\exp[i(2\theta_2 - \theta_1)]\}$  (4.8.36)

The amplification factor of alternating seven-point ILU is given by $\lambda(\theta)\bar\lambda(\theta)$, with $\lambda(\theta)$ given
by (4.8.19).

Anisotropic diffusion equation

Table 4.8.6 gives some results for the rotated anisotropic diffusion equation. The worst case
for $\beta$ in the set $\{\beta = k\pi/12,\ k = 0, 1, 2, ..., 23\}$ is included. We see that with $\sigma = 0.5$ we have
a robust smoother for this test case. Similar results (not given here) are obtained when the
mixed derivative is approximated by (4.5.8) with alternating nine-point ILU.

$\varepsilon$    $\sigma$   $\rho$                $\rho_D$              $\rho, \rho_D$        $\beta$
                            ($\beta$ = 0°, 90°)   ($\beta$ = 0°, 90°)

$1$              0          $9 \times 10^{-3}$    $9 \times 10^{-3}$    $9 \times 10^{-3}$    any
$10^{-1}$        0          0.021                 0.021                 0.061                 30°
$10^{-2}$        0          0.041                 0.024                 0.25                  45°
$10^{-3}$        0          0.057                 $3 \times 10^{-3}$    0.61                  45°
$10^{-5}$        0          0.064                 $10^{-?}$             0.94                  45°
$1$              0.5        $4 \times 10^{-3}$    $4 \times 10^{-3}$    $4 \times 10^{-3}$    any
$10^{-1}$        0.5        0.014                 0.014                 0.028                 15°
$10^{-2}$        0.5        0.20                  0.012                 0.058                 45°
$10^{-3}$        0.5        0.026                 $2 \times 10^{-3}$    0.090                 45°
$10^{-5}$        0.5        0.028                 0                     0.11                  45°

Table 4.8.6: Fourier smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6); alternating seven-point ILU smoothing; $n = 64$

Convection-diffusion  equation 

Symmetry considerations imply that the second step of alternating seven-point ILU smoothing
has, for $\varepsilon \ll 1$, $\rho \approx 1$ for $\beta$ around $90^\circ$ and $270^\circ$. Here, however, the first step has $\rho \ll 1$.
Hence, we expect the alternating smoother to be robust for the convection-diffusion equation.
This is confirmed by the results of Table 4.8.7. The worst case for $\beta$ in the set
$\{\beta = k\pi/12 : k = 0, 1, 2, ..., 23\}$ is listed.

                 $\sigma = 0$                     $\sigma = 0.5$
$\varepsilon$    $\rho, \rho_D$        $\beta$    $\rho, \rho_D$        $\beta$

$1.0$            $9 \times 10^{-3}$    0°         $4 \times 10^{-3}$    0°
$10^{-1}$        $9 \times 10^{-3}$    0°         $4 \times 10^{-3}$    0°
$10^{-2}$        0.019                 105°       $7 \times 10^{-3}$    0°
$10^{-3}$        0.063                 105°       0.027                 ?
$10^{-5}$        0.086                 ?          0.036                 105°

Table 4.8.7: Fourier smoothing factors $\rho$, $\rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); alternating seven-point ILU smoothing; $n = 64$

To sum up, alternating modified point ILU is robust and very efficient in all cases. The use
of alternating ILU has been proposed in [97].

Modification has been analyzed and tested in [65], [97], [83], [82], [145] and [147].


4.9  Incomplete  block  factorization  smoothing 

For  the  smoothing  analysis  of  incomplete  block  factorization  we  refer  to  [141].  We  present 
some  results. 

Anisotropic  diffusion  equation 

Tables 4.9.1 and 4.9.2 give results for the two discretizations (4.5.6) and (4.5.8) of the rotated
anisotropic diffusion equation. The worst cases for $\beta$ in the set $\{\beta = k\pi/12,\ k = 0, 1, ..., 23\}$
are included. In some cases the elements of $D$ do not settle down quickly to values independent
of location as one moves away from the grid boundaries, so that in these cases Fourier
smoothing analysis is not realistic.

$\varepsilon$    $\rho$          $\rho$           $\rho$, $\beta$    $\rho_D$        $\rho_D$         $\rho_D$, $\beta$
                 $\beta$ = 0°    $\beta$ = 90°                       $\beta$ = 0°    $\beta$ = 90°

$1$              0.058           0.058            0.058, any         0.056           0.056            0.056, any
$10^{-1}$        0.108           0.133            0.133, 90°         0.102           0.116            0.116, 90°
$10^{-2}$        0.149           0.176            0.131, 45°         0.195           0.078            0.131, 45°
$10^{-3}$        0.164*          0.194            0.157*, 45°        0.025*          0.005            0.157*, 45°
$10^{-5}$        0.141           0.120            0.166*, 45°        0               0                0.166*, 45°

Table 4.9.1: Fourier smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6); IBLU smoothing; $n = 64$. The symbol * indicates that
the coefficients do not become constant rapidly away from the boundaries; therefore the
corresponding value is not realistic.


Convection-diffusion  equation 

Table 4.9.3 gives results for the convection-diffusion equation, sampling $\beta$ as before.

It is clear that IBLU is an efficient smoother for all cases. This is confirmed by the multigrid
results presented in [107].

$\varepsilon$    $\rho$          $\rho$           $\rho_D$        $\rho_D$
                 $\beta$ = 0°    $\beta$ = 90°    $\beta$ = 0°    $\beta$ = 90°

$1$              0.058           0.058            0.056           0.056
$10^{-1}$        0.108           0.133            0.102           0.116
$10^{-2}$        0.49            0.176            0.096           0.078
$10^{-3}$        0.164*          0.194            0.025*          $5 \times 10^{-3}$
$10^{-5}$        0.141*          0.200            0.000*          0.000

Table 4.9.2: Fourier smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6) but with mixed derivative according to (4.5.8); IBLU smoothing;
$n = 64$. The symbol * has the same meaning as in the preceding table.

$\varepsilon$    $\rho$    $\beta$    $\rho_D$     $\beta$

$1.0$            0.058     0°         0.056        0°
$10^{-1}$        0.061     0°         0.058        0°
$10^{-2}$        0.092     0°         0.090        0°
$10^{-3}$        0.173     0°         0.121        0°
$10^{-5}$        0.200     0°         $10^{-2}$    15°

Table 4.9.3: Fourier smoothing factors $\rho$, $\rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); IBLU smoothing; $n = 64$.

4.10 Fourier analysis of white-black and zebra Gauss-Seidel smoothing

The Fourier analysis of white-black and zebra Gauss-Seidel smoothing requires special treatment,
because the Fourier modes as defined in Section 4.3 are not invariant under these
iteration methods. The Fourier analysis of these methods is discussed in detail in [112]. There,
sinusoidal Fourier modes are used. The resulting analysis is applicable only to special cases of the
set of test problems defined in Section 4.5. Therefore we will continue to use exponential
Fourier modes.

The  amplification  matrix 

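Specializing to two dimensions and assuming $n_1$ and $n_2$ to be even, we have

\[ \psi_j(\theta) = \exp(ij\theta) \tag{4.10.1} \]

with

\[ j = (j_1, j_2), \quad j_\alpha = 0, 1, 2, ..., n_\alpha - 1 \tag{4.10.2} \]

and

\[ \theta \in \Theta = \{(\theta_1, \theta_2) : \theta_\alpha = 2\pi k_\alpha/n_\alpha,\ k_\alpha = -m_\alpha, -m_\alpha + 1, ..., m_\alpha + 1\} \tag{4.10.3} \]

where $m_\alpha = n_\alpha/2 - 1$. Define

\[
\theta^1 \in \Theta_{\bar s} = \Theta \cap [-\pi/2, \pi/2)^2, \quad
\theta^2 = \theta^1 - \begin{pmatrix} \mathrm{sign}(\theta^1_1)\pi \\ \mathrm{sign}(\theta^1_2)\pi \end{pmatrix}, \quad
\theta^3 = \theta^1 - \begin{pmatrix} 0 \\ \mathrm{sign}(\theta^1_2)\pi \end{pmatrix}, \quad
\theta^4 = \theta^1 - \begin{pmatrix} \mathrm{sign}(\theta^1_1)\pi \\ 0 \end{pmatrix}
\tag{4.10.4}
\]

where $\mathrm{sign}(t) = -1$, $t < 0$; $\mathrm{sign}(t) = 1$, $t \ge 0$. Note that $\Theta_{\bar s}$ almost coincides with the set of
smooth wavenumbers $\Theta_s$ defined by (4.4.20). As we will see, Span $\{\psi(\theta^1), \psi(\theta^2), \psi(\theta^3), \psi(\theta^4)\}$
is left invariant by the smoothing methods considered in this section.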
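The construction of the four coupled wavenumbers can be sketched in a few lines of Python. This is only an illustration: the function names `sign`, `harmonics` and `low_wavenumbers` are made up for this sketch, it follows the reading of (4.10.4) in which $\theta^3$ shifts only the second component and $\theta^4$ only the first, and it assumes that $n$ is divisible by 4.

```python
import math


def sign(t):
    """sign(t) = -1 for t < 0 and +1 for t >= 0, as in the text."""
    return -1.0 if t < 0 else 1.0


def harmonics(theta1, theta2):
    """Return [theta^1, theta^2, theta^3, theta^4] for a low wavenumber theta^1."""
    s1 = sign(theta1) * math.pi
    s2 = sign(theta2) * math.pi
    return [
        (theta1, theta2),            # theta^1
        (theta1 - s1, theta2 - s2),  # theta^2: both components shifted
        (theta1, theta2 - s2),       # theta^3: second component shifted
        (theta1 - s1, theta2),       # theta^4: first component shifted
    ]


def low_wavenumbers(n):
    """Wavenumbers 2*pi*k/n in [-pi/2, pi/2), i.e. k = -n/4, ..., n/4 - 1."""
    return [2.0 * math.pi * k / n for k in range(-n // 4, n // 4)]
```

Looping over all pairs from `low_wavenumbers(n)` and collecting the four harmonics of each pair visits every wavenumber of the $n \times n$ grid exactly once (modulo $2\pi$), which is why a vector of dimension 4 per low wavenumber suffices.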

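Let $\psi(\theta) = (\psi(\theta^1), \psi(\theta^2), \psi(\theta^3), \psi(\theta^4))^T$. The Fourier representation of an arbitrary periodic
grid function (4.3.7) can be written as

\[ u_j = \sum_{\theta \in \Theta_{\bar s}} c_\theta^T \psi_j(\theta) \tag{4.10.5} \]

with $c_\theta$ a vector of dimension 4.

If the error before smoothing is $c_\theta^T \psi(\theta)$, then after smoothing it is given by $(\Lambda(\theta) c_\theta)^T \psi(\theta)$,
with $\Lambda(\theta)$ a $4 \times 4$ matrix, called the amplification matrix.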


The  smoothing  factor 

The set of smooth wavenumbers $\Theta_s$ has been defined by (4.4.20). Comparison with $\Theta_{\bar s}$ as
defined by (4.10.4) shows that $\psi(\theta^k)$, $k = 2, 3, 4$ are rough Fourier modes, whereas $\psi(\theta^1)$
is smooth, except when $\theta^1_1 = -\pi/2$ or $\theta^1_2 = -\pi/2$. The projection operator on the space
spanned by the rough Fourier modes is, therefore, given by the following diagonal matrix

\[
Q(\theta) = \begin{pmatrix} \delta(\theta) & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
\tag{4.10.6}
\]

with $\delta(\theta) = 1$ if $\theta_1 = -\pi/2$ and/or $\theta_2 = -\pi/2$, and $\delta(\theta) = 0$ otherwise. Hence, a suitable
definition of the Fourier smoothing factor is

\[ \rho = \max\{\chi(Q(\theta)\Lambda(\theta)) : \theta \in \Theta_{\bar s}\} \tag{4.10.7} \]

with $\chi$ the spectral radius.

The influence of Dirichlet boundary conditions can be taken into account heuristically in a
similar way as before. Wavenumbers of the type $(0, \theta^s_2)$ and $(\theta^s_1, 0)$, $s = 1, 3, 4$, are to be
disregarded (note that $\theta^2_\alpha = 0$ cannot occur), that is, the corresponding elements of $c_\theta$ are to
be replaced by zero. This can be implemented by replacing $Q\Lambda$ by $PQ\Lambda$ with

\[
P(\theta) = \begin{pmatrix} p_1(\theta) & & & \\ & 1 & & \\ & & p_3(\theta) & \\ & & & p_4(\theta) \end{pmatrix}
\tag{4.10.8}
\]

where $p_1(\theta) = 0$ if $\theta_1 = 0$ and/or $\theta_2 = 0$, and $p_1(\theta) = 1$ otherwise; $p_3(\theta) = 0$ if $\theta_1 = 0$ (hence
$\theta^3_1 = 0$), and $p_3(\theta) = 1$ otherwise; similarly, $p_4(\theta) = 0$ if $\theta_2 = 0$ (hence $\theta^4_2 = 0$), and $p_4(\theta) = 1$
otherwise. The definition of the smoothing factor in the case of Dirichlet boundary conditions
can now be given as

\[ \rho_D = \max\{\chi(P(\theta)Q(\theta)\Lambda(\theta)) : \theta \in \Theta_{\bar s}\} \tag{4.10.9} \]

Analogous to (4.4.23) a mesh-size independent smoothing factor $\bar\rho$ is defined as

\[ \bar\rho = \sup\{\chi(Q(\theta)\Lambda(\theta)) : \theta \in \bar\Theta_{\bar s}\} \tag{4.10.10} \]

with $\bar\Theta_{\bar s} = (-\pi/2, \pi/2)^2$.

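White-black Gauss-Seidel

Let $A$ have the five-point stencil given by (4.8.1) with $b = f = 0$. The use of white-black
Gauss-Seidel makes no sense for the seven-point stencil (4.8.1) or the nine-point stencil
(4.8.28), since the unknowns in points of the same colour cannot be updated independently.
For these stencils multi-coloured Gauss-Seidel can be used, but we will not go into this.

Define grid points $(j_1, j_2)$ with $j_1 + j_2$ even to be white and the remainder black. We
will study white-black Gauss-Seidel with damping. Let $\varepsilon^0$ be the initial error, $\varepsilon^{1/3}$ the error
after the white step, $\varepsilon^{2/3}$ the error after the black step, and $\varepsilon^1$ the error after damping with
parameter $\omega$. Then we have

\[
\varepsilon^{1/3}_j = \begin{cases}
-(a\varepsilon^0_{j-e_2} + c\varepsilon^0_{j-e_1} + q\varepsilon^0_{j+e_1} + g\varepsilon^0_{j+e_2})/d, & j_1 + j_2 \text{ even} \\
\varepsilon^0_j, & j_1 + j_2 \text{ odd}
\end{cases}
\tag{4.10.11}
\]

The relation between $\varepsilon^{2/3}$ and $\varepsilon^{1/3}$ is obtained from (4.10.11) by interchanging even and odd.
The final error is given by

\[ \varepsilon^1_j = \omega\varepsilon^{2/3}_j + (1 - \omega)\varepsilon^0_j \tag{4.10.12} \]

Let the Fourier representation of $\varepsilon^\alpha$, $\alpha = 0, 1/3, 2/3, 1$ be given by

\[ \varepsilon^\alpha_j = \sum_{\theta \in \Theta_{\bar s}} (c^\alpha_\theta)^T \psi_j(\theta) \tag{4.10.13} \]

If $\varepsilon^0_j = \psi_j(\theta^s)$, $s = 1, 2, 3$ or $4$, then

\[
\varepsilon^{1/3}_j = \begin{cases}
\mu(\theta^s)\psi_j(\theta^s), & j_1 + j_2 \text{ even} \\
\psi_j(\theta^s), & j_1 + j_2 \text{ odd}
\end{cases}
\]

with $\mu(\theta) = -[a\exp(-i\theta_2) + c\exp(-i\theta_1) + q\exp(i\theta_1) + g\exp(i\theta_2)]/d$. Hence

\[
\varepsilon^{1/3}_j = \tfrac{1}{2}(\mu(\theta^s) + 1)\exp(ij\theta^s) + \tfrac{1}{2}(\mu(\theta^s) - 1)\exp[ij_1(\theta^s_1 - \pi)]\exp[ij_2(\theta^s_2 - \pi)]
\tag{4.10.14}
\]

so that

\[
c^{1/3}_\theta = \frac{1}{2}\begin{pmatrix}
1 + \mu_1 & -1 - \mu_1 & 0 & 0 \\
\mu_1 - 1 & 1 - \mu_1 & 0 & 0 \\
0 & 0 & 1 + \mu_2 & -1 - \mu_2 \\
0 & 0 & \mu_2 - 1 & 1 - \mu_2
\end{pmatrix} c^0_\theta, \quad \theta \in \Theta_{\bar s}
\tag{4.10.15}
\]

where $\mu_1 = \mu(\theta)$, $\mu_2 = (a\exp(-i\theta_2) - c\exp(-i\theta_1) - q\exp(i\theta_1) + g\exp(i\theta_2))/d$. If the
black step is treated in a similar way one finds, combining the two steps and incorporating
the damping step,

\[ c^1_\theta = \{\omega\Lambda(\theta) + (1 - \omega)I\}c^0_\theta \tag{4.10.16} \]

with

\[
\Lambda(\theta) = \frac{1}{2}\begin{pmatrix}
\mu_1(1 + \mu_1) & -\mu_1(1 + \mu_1) & 0 & 0 \\
\mu_1(1 - \mu_1) & \mu_1(\mu_1 - 1) & 0 & 0 \\
0 & 0 & \mu_2(1 + \mu_2) & -\mu_2(1 + \mu_2) \\
0 & 0 & \mu_2(1 - \mu_2) & \mu_2(\mu_2 - 1)
\end{pmatrix}
\tag{4.10.17}
\]

Hence

\[
P(\theta)Q(\theta)\Lambda(\theta) = \frac{1}{2}\begin{pmatrix}
p_1\delta\mu_1(1 + \mu_1) & -p_1\delta\mu_1(1 + \mu_1) & 0 & 0 \\
\mu_1(1 - \mu_1) & \mu_1(\mu_1 - 1) & 0 & 0 \\
0 & 0 & p_3\mu_2(1 + \mu_2) & -p_3\mu_2(1 + \mu_2) \\
0 & 0 & p_4\mu_2(1 - \mu_2) & p_4\mu_2(\mu_2 - 1)
\end{pmatrix}
\tag{4.10.18}
\]

The eigenvalues of $PQ\Lambda$ are

\[
\lambda_1(\theta) = 0, \quad \lambda_2(\theta) = \tfrac{1}{2}\mu_1\{p_1\delta - 1 + \mu_1(1 + p_1\delta)\}, \quad
\lambda_3(\theta) = 0, \quad \lambda_4(\theta) = \tfrac{1}{2}\mu_2\{p_3 - p_4 + \mu_2(p_3 + p_4)\}
\tag{4.10.19}
\]

and the two types of Fourier smoothing factor are found to be

\[ \rho, \rho_D = \max\{|\omega\lambda_2(\theta) + 1 - \omega|,\ |\omega\lambda_4(\theta) + 1 - \omega| : \theta \in \Theta_{\bar s}\} \tag{4.10.20} \]

where $p_1 = p_3 = p_4 = 1$ in (4.10.19) gives $\rho$, and choosing $p_1$, $p_3$, $p_4$ as defined after equation
(4.10.8) gives $\rho_D$.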


With $\omega = 1$ we have $\rho = \rho_D = 1/4$ for Laplace's equation [112]. This is better than
lexicographic Gauss-Seidel, for which $\rho = 1/2$ (Section 4.7). Furthermore, white-black
Gauss-Seidel obviously lends itself very well to vectorized and parallel computing. This fact,
combined with the good smoothing properties for the Laplace equation, has led to some of
the fastest Poisson solvers in existence, based on multigrid with white-black smoothing [13],
[113].
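The smoothing factors of white-black Gauss-Seidel can be evaluated numerically from the eigenvalue formulas (4.10.19) and the definition (4.10.20). The following Python sketch does this for a general five-point stencil; the function name and argument order are illustrative assumptions, the wavenumber set is taken to be $\theta_\alpha = 2\pi k_\alpha/n$ with $k_\alpha = -n/4, ..., n/4 - 1$, and $n$ is assumed to be divisible by 4.

```python
import cmath
import math


def white_black_smoothing_factors(a, c, d, q, g, n=64, omega=1.0):
    """Return (rho, rho_D) for the five-point stencil with south, west,
    centre, east and north coefficients a, c, d, q, g."""

    def eigenvalues(k1, k2, dirichlet):
        t1 = 2.0 * math.pi * k1 / n
        t2 = 2.0 * math.pi * k2 / n
        e1, e2 = cmath.exp(1j * t1), cmath.exp(1j * t2)
        mu1 = -(a / e2 + c / e1 + q * e1 + g * e2) / d
        mu2 = (a / e2 - c / e1 - q * e1 + g * e2) / d
        # delta = 1 if theta_1 = -pi/2 and/or theta_2 = -pi/2
        delta = 1.0 if (k1 == -n // 4 or k2 == -n // 4) else 0.0
        p1 = p3 = p4 = 1.0          # periodic boundary conditions
        if dirichlet:               # factors defined after (4.10.8)
            p1 = 0.0 if (k1 == 0 or k2 == 0) else 1.0
            p3 = 0.0 if k1 == 0 else 1.0
            p4 = 0.0 if k2 == 0 else 1.0
        lam2 = 0.5 * mu1 * (p1 * delta - 1.0 + mu1 * (1.0 + p1 * delta))
        lam4 = 0.5 * mu2 * (p3 - p4 + mu2 * (p3 + p4))
        return lam2, lam4

    factors = []
    for dirichlet in (False, True):
        worst = 0.0
        for k1 in range(-n // 4, n // 4):
            for k2 in range(-n // 4, n // 4):
                for lam in eigenvalues(k1, k2, dirichlet):
                    worst = max(worst, abs(omega * lam + 1.0 - omega))
        factors.append(worst)
    return factors[0], factors[1]


# Laplace's equation: a = c = q = g = -1, d = 4, no damping.
rho, rho_d = white_black_smoothing_factors(-1.0, -1.0, 4.0, -1.0, -1.0)
print(rho, rho_d)
```

Passing other stencil coefficients, for example those of the upwind convection-diffusion discretization, allows the same routine to be used for the other test problems, under the same assumptions.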

Convection-diffusion  equation 

With $\beta = 0$ equation (4.5.11) gives $a = -\varepsilon$, $c = -\varepsilon - h$, $d = 4\varepsilon + h$, $q = -\varepsilon$, $g = -\varepsilon$, so
that $\mu_{1,2}(0, -\pi/2) = (2 + P)/(4 + P)$, with $P = h/\varepsilon$ the mesh Peclet number. Hence, with
$p_1 = p_3 = p_4 = 1$ we have $\lambda_{2,4}(0, -\pi/2) = (2 + P)^2/(4 + P)^2$, so that $\rho \to 1$ as $P \to \infty$ for all
$\omega$, and the same is true for $\rho_D$. Hence white-black Gauss-Seidel is not a good smoother for
this test problem.
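The loss of smoothing at large mesh Peclet number can be made concrete by tabulating the eigenvalue $(2 + P)^2/(4 + P)^2$ and its damped counterpart. The sketch below is only an illustration of that formula; the function names are made up here.

```python
def worst_eigenvalue(peclet):
    """lambda_{2,4}(0, -pi/2) = ((2 + P)/(4 + P))**2 for flow angle beta = 0."""
    mu = (2.0 + peclet) / (4.0 + peclet)
    return mu * mu


def damped(lam, omega):
    """Amplification |omega*lambda + 1 - omega| after damping with omega."""
    return abs(omega * lam + 1.0 - omega)


for P in (1.0, 10.0, 100.0, 1000.0):
    lam = worst_eigenvalue(P)
    print(P, lam, damped(lam, 0.7))
```

Since the eigenvalue tends to 1, the damped value $\omega\lambda + 1 - \omega$ tends to 1 as well for every $\omega$, which is the statement made in the text.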


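Smoothing factor of zebra Gauss-Seidel

Let $A$ have the following nine-point stencil:

\[
[A] = \begin{bmatrix} f & g & p \\ c & d & q \\ z & a & b \end{bmatrix}
\tag{4.10.21}
\]

Let us consider horizontal zebra smoothing with damping. Define grid points $(j_1, j_2)$ with $j_2$
even to be white and the remainder to be black. Let $\varepsilon^0$ be the initial error, $\varepsilon^{1/3}$ the error
after the 'white' step, $\varepsilon^{2/3}$ the error after the 'black' step, and $\varepsilon^1$ the error after damping
with parameter $\omega$. Then we have

\[
c\varepsilon^{1/3}_{j-e_1} + d\varepsilon^{1/3}_j + q\varepsilon^{1/3}_{j+e_1}
= -(z\varepsilon^0_{j-e_1-e_2} + a\varepsilon^0_{j-e_2} + b\varepsilon^0_{j+e_1-e_2}
+ f\varepsilon^0_{j-e_1+e_2} + g\varepsilon^0_{j+e_2} + p\varepsilon^0_{j+e_1+e_2}), \quad j_2 \text{ even}
\tag{4.10.22}
\]

\[ \varepsilon^{1/3}_j = \varepsilon^0_j, \quad j_2 \text{ odd} \tag{4.10.23} \]

where $e_1 = (1, 0)$ and $e_2 = (0, 1)$.

The relation between $\varepsilon^{2/3}$ and $\varepsilon^{1/3}$ is obtained from (4.10.22) and (4.10.23) by interchanging
even and odd, and the final error is given by (4.10.12).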

It turns out that zebra iteration leaves certain two-dimensional subspaces invariant in Fourier
space. In order to facilitate the analysis of alternating zebra, for which the invariant subspaces
are the same as for white-black, we continue the use of the four-dimensional subspaces $\psi(\theta)$
introduced earlier.

In [141] it is shown that the eigenvalues of $P(\theta)Q(\theta)\Lambda(\theta)$ are

\[
\begin{aligned}
\lambda_1(\theta) &= 0, \quad \lambda_2(\theta) = \tfrac{1}{2}p_1\delta\mu_1(1 + \mu_1) - \tfrac{1}{2}p_3\mu_1(1 - \mu_1), \quad \lambda_3(\theta) = 0 \\
\lambda_4(\theta) &= \tfrac{1}{2}\mu_2(1 + \mu_2) + \tfrac{1}{2}p_4\mu_2(\mu_2 - 1)
\end{aligned}
\tag{4.10.24}
\]

with

\[
\mu_1(\theta) = -\frac{z\exp[-i(\theta_1 + \theta_2)] + a\exp(-i\theta_2) + b\exp[i(\theta_1 - \theta_2)]
+ f\exp[i(\theta_2 - \theta_1)] + g\exp(i\theta_2) + p\exp[i(\theta_1 + \theta_2)]}
{c\exp(-i\theta_1) + d + q\exp(i\theta_1)}
\]

and $\mu_2(\theta) = \mu_1(\theta_1 - \pi, \theta_2 - \pi)$.

The two types of Fourier smoothing factor are given by (4.10.20), taking $\lambda_2$, $\lambda_4$ from
(4.10.24).
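The zebra smoothing factors follow from (4.10.20) and (4.10.24) in the same way as for white-black Gauss-Seidel. The Python sketch below is a hedged illustration: the dictionary keys mirror the stencil letters of (4.10.21), the function name is made up for this example, the wavenumbers are sampled as $\theta_\alpha = 2\pi k_\alpha/n$ with $k_\alpha = -n/4, ..., n/4 - 1$, and $n$ is assumed to be divisible by 4.

```python
import cmath
import math


def zebra_smoothing_factors(st, n=64, omega=1.0):
    """Return (rho, rho_D) for horizontal zebra Gauss-Seidel.

    st maps the stencil letters z, a, b (bottom row), c, d, q (middle row)
    and f, g, p (top row) to their coefficients.
    """

    def mu1(t1, t2):
        e1, e2 = cmath.exp(1j * t1), cmath.exp(1j * t2)
        num = (st['z'] / (e1 * e2) + st['a'] / e2 + st['b'] * e1 / e2
               + st['f'] * e2 / e1 + st['g'] * e2 + st['p'] * e1 * e2)
        den = st['c'] / e1 + st['d'] + st['q'] * e1
        return -num / den

    def eigenvalues(k1, k2, dirichlet):
        t1 = 2.0 * math.pi * k1 / n
        t2 = 2.0 * math.pi * k2 / n
        delta = 1.0 if (k1 == -n // 4 or k2 == -n // 4) else 0.0
        p1 = p3 = p4 = 1.0          # periodic boundary conditions
        if dirichlet:               # factors defined after (4.10.8)
            p1 = 0.0 if (k1 == 0 or k2 == 0) else 1.0
            p3 = 0.0 if k1 == 0 else 1.0
            p4 = 0.0 if k2 == 0 else 1.0
        m1 = mu1(t1, t2)
        m2 = mu1(t1 - math.pi, t2 - math.pi)
        lam2 = 0.5 * p1 * delta * m1 * (1.0 + m1) - 0.5 * p3 * m1 * (1.0 - m1)
        lam4 = 0.5 * m2 * (1.0 + m2) + 0.5 * p4 * m2 * (m2 - 1.0)
        return lam2, lam4

    factors = []
    for dirichlet in (False, True):
        worst = 0.0
        for k1 in range(-n // 4, n // 4):
            for k2 in range(-n // 4, n // 4):
                for lam in eigenvalues(k1, k2, dirichlet):
                    worst = max(worst, abs(omega * lam + 1.0 - omega))
        factors.append(worst)
    return factors[0], factors[1]


# Five-point Laplace stencil embedded in the nine-point layout.
laplace = dict(z=0.0, a=-1.0, b=0.0, c=-1.0, d=4.0, q=-1.0,
               f=0.0, g=-1.0, p=0.0)
print(zebra_smoothing_factors(laplace))
```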

Anisotropic  diffusion  equation 

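For $\varepsilon = 1$ (Laplace's equation), $\omega = 1$ (no damping) and $p_1 = p_3 = p_4 = 1$ (periodic boundary
conditions) we have $\mu_1(\theta) = \cos\theta_2/(2 - \cos\theta_1)$ and $\mu_2(\theta) = -\cos\theta_2/(2 + \cos\theta_1)$. One finds
$\max\{|\lambda_2(\theta)| : \theta \in \bar\Theta_{\bar s}\} = |\lambda_2(\pi/2, 0)| = 1/8$ and $\max\{|\lambda_4(\theta)| : \theta \in \bar\Theta_{\bar s}\} = |\lambda_4(\pi/2, 0)| = 1/4$, so
that the smoothing factor is $\bar\rho = \rho = 1/4$.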

For $\varepsilon \ll 1$ and the rotation angle $\beta = 0$ we have strong coupling in the vertical direction, so
that horizontal zebra smoothing is not expected to work. We have $\mu_2(\theta) = -\cos\theta_2/(1 + \varepsilon +
\varepsilon\cos\theta_1)$, so that $|\lambda_4(\pi/2, 0)| = (1 + \varepsilon)^{-2}$, hence $\lim_{\varepsilon \downarrow 0} \rho \ge 1$. Furthermore, with $\varphi = 2\pi/n$,
we have $|\lambda_4(\pi/2, \varphi)| = \cos^2\varphi/(1 + \varepsilon)^2$, so that $\lim_{\varepsilon \downarrow 0} \rho_D \ge 1 - O(h^2)$. Damping does not help
here. We conclude that horizontal zebra is not robust for the anisotropic diffusion equation,
and the same is true for vertical zebra, of course.
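The two eigenvalues quoted above are easy to tabulate. The following Python sketch evaluates $\lambda_4 = \mu_2^2$ (valid for $p_4 = 1$) at the two wavenumbers used in the argument; the function name is an illustrative assumption and $n = 64$ is taken as in the tables.

```python
import math


def lambda4_anisotropic(eps, theta1, theta2):
    """lambda_4 = mu_2**2 with mu_2 = -cos(theta2)/(1 + eps + eps*cos(theta1))."""
    mu2 = -math.cos(theta2) / (1.0 + eps + eps * math.cos(theta1))
    return mu2 * mu2


n = 64
phi = 2.0 * math.pi / n
for eps in (1.0, 1e-1, 1e-3, 1e-5):
    # theta2 = 0 is the periodic worst case, theta2 = phi its Dirichlet analogue
    print(eps,
          lambda4_anisotropic(eps, math.pi / 2, 0.0),
          lambda4_anisotropic(eps, math.pi / 2, phi))
```

For $\varepsilon = 1$ the first value reduces to the Laplace result $1/4$; as $\varepsilon$ decreases both values approach 1, the second one only up to the factor $\cos^2\varphi$.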

Convection-diffusion  equation 

With convection angle $\beta = \pi/2$ in (4.5.11) we have

\[ \mu_2(\theta) = -[(1 + P)\exp(-i\theta_2) + \exp(i\theta_2)]/(4 + P + 2\cos\theta_1) \]

where $P = h/\varepsilon$ is the mesh Peclet number. With $p_4 = 1$ (periodic boundary conditions) we
have $\lambda_4 = \mu_2^2$, so that $\lambda_4(\pi/2, 0) = (2 + P)^2/(4 + P)^2$, and we see that $\omega\lambda_4(\pi/2, 0) + 1 - \omega \approx 1$
for $P \gg 1$, so that $\rho \approx 1$ for $P \gg 1$ for all $\omega$. Hence, zebra smoothing is not suitable for the
convection-diffusion equation at large mesh Peclet number.


Smoothing  factor  of  alternating  zebra  Gauss-Seidel 

As we saw, horizontal zebra smoothing does not work when there is strong coupling (large
diffusion coefficient or strong convection) in the vertical direction. This suggests the use of
alternating zebra: horizontal and vertical zebra combined. Following the suggestion in [112],
we will arrange alternating zebra in the following 'symmetric' way: in vertical zebra we do
first the 'black' step and then the 'white' step, because this gives slightly better smoothing
factors, and leads to identical results for $\beta = 0^\circ$ and $\beta = 90^\circ$. The $4 \times 4$ amplification matrix
of vertical zebra is found to be

\[
\Lambda_v(\theta) = \frac{1}{2}\begin{pmatrix}
\nu_1(\nu_1 + 1) & 0 & 0 & \nu_1(\nu_1 + 1) \\
0 & \nu_2(\nu_2 + 1) & \nu_2(\nu_2 + 1) & 0 \\
0 & \nu_2(\nu_2 - 1) & \nu_2(\nu_2 - 1) & 0 \\
\nu_1(\nu_1 - 1) & 0 & 0 & \nu_1(\nu_1 - 1)
\end{pmatrix}
\tag{4.10.25}
\]

where

\[
\nu_1(\theta) = -\frac{z\exp[-i(\theta_1 + \theta_2)] + b\exp[i(\theta_1 - \theta_2)] + c\exp(-i\theta_1)
+ q\exp(i\theta_1) + f\exp[i(\theta_2 - \theta_1)] + p\exp[i(\theta_1 + \theta_2)]}
{a\exp(-i\theta_2) + d + g\exp(i\theta_2)}
\]

and $\nu_2(\theta) = \nu_1(\theta_1 - \pi, \theta_2 - \pi)$. We will consider two types of damping: damping the horizontal
and vertical steps separately (to be referred to as double damping) and damping only after
the two steps have been completed. Double damping results in an amplification matrix given
by

\[ \Lambda = PQ[(1 - \omega_d)I + \omega_d\Lambda_v][(1 - \omega_d)I + \omega_d\Lambda_h] \tag{4.10.26} \]

where $\Lambda_h$ is given in [141]. In the case of single damping, put $\omega_d = 1$ in (4.10.26) and replace
$\Lambda$ by

\[ \Lambda := (1 - \omega_s)I + \omega_s\Lambda \tag{4.10.27} \]

The eigenvalues of the $4 \times 4$ matrix $\Lambda$ are easily determined numerically.

Anisotropic  diffusion  equation 

Tables 4.10.1 and 4.10.2 give results for the smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic
diffusion equation. The worst cases for the rotation angle $\beta$ in the set $\{\beta = k\pi/12,\ k =
0, 1, 2, ..., 23\}$ are included. For the results of Table 4.10.1 no damping was used. Introduction
of damping ($\omega_d \ne 1$ or $\omega_s \ne 1$) gives no improvement. However, as shown by Table 4.10.2, if
the mixed derivative is discretized according to (4.5.8) good results are obtained. For cases
with $\varepsilon = 1$ or $\beta = 0^\circ$ or $\beta = 90^\circ$ the two discretizations are identical of course, so for these
cases without damping Table 4.10.1 applies. For Table 4.10.2 $\beta$ has been sampled with an
interval of $2^\circ$. Symmetry means that only $\beta \in [0^\circ, 45^\circ]$ needs to be considered. Results with
single damping ($\omega_s = 0.7$) are included. Clearly, damping is not needed in this case and is
even somewhat disadvantageous. As will be seen shortly, this method, however, works for the
convection-diffusion test problem only if damping is applied. Numerical experiments show
that a fixed value of $\omega_s = 0.7$ is suitable, and that there is not much difference between single
damping and double damping. We present results only for single damping.

$\varepsilon$    $\rho$                $\rho_D$              $\rho, \rho_D$    $\beta$
                 ($\beta$ = 0°, 90°)   ($\beta$ = 0°, 90°)

$1$              0.048                 0.048                 0.048             any
$10^{-1}$        0.102                 0.100                 0.480             45°
$10^{-2}$        0.122                 0.121                 0.924             45°
$10^{-3}$        0.124                 0.070                 0.992             45°
$10^{-5}$        0.125                 0.001                 1.000             45°

Table 4.10.1: Fourier smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6); alternating zebra smoothing; $n = 64$

                 $\omega_s = 1$                $\omega_s = 0.7$
$\varepsilon$    $\rho, \rho_D$    $\beta$     $\rho, \rho_D$         $\rho, \rho_D$    $\beta$
                                               ($\beta$ = 0°, 90°)

$1$              0.048             any         0.317                  0.317             any
$10^{-1}$        0.229             30°         0.302                  0.460             34°
$10^{-2}$        0.426             14°         0.300                  0.598             14°
$10^{-3}$        0.503             8°          0.300                  0.653             8°
$10^{-5}$        0.537             4°          0.300                  0.668             8°
$10^{-?}$        0.538             4°          0.300                  0.668             8°

Table 4.10.2: Fourier smoothing factors $\rho$, $\rho_D$ for the rotated anisotropic diffusion equation
discretized according to (4.5.6) but with the mixed derivative approximated by (4.5.8); alternating zebra
smoothing with single damping; $n = 64$

Convection-diffusion equation

For Table 4.10.3, $\beta$ has been sampled with intervals of $2^\circ$; the worst cases are presented. The
results of Table 4.10.3 show that alternating zebra without damping is a reasonable smoother
for the convection-diffusion equation. If the mesh Peclet number $h\cos\beta/\varepsilon$ or $h\sin\beta/\varepsilon$
becomes large ($> 100$, say), $\rho$ approaches 1, but $\rho_D$ remains reasonable.

                 $\omega_s = 1$                               $\omega_s = 0.7$
$\varepsilon$    $\rho$    $\beta$    $\rho_D$    $\beta$     $\rho, \rho_D$    $\beta$

$1$              0.048     0°         0.048       0°          0.317             0°
$10^{-1}$        0.049     0°         0.049       0°          0.318             20°
$10^{-2}$        0.080     ?          0.079       26°         0.324             42°
$10^{-3}$        0.413     24°        0.369       28°         0.375             44°
$10^{-5}$        0.948     4°         0.584       22°         0.443             4°
$10^{-?}$        0.995     2°         0.587       22°         0.448             4°

Table 4.10.3: Fourier smoothing factors $\rho$, $\rho_D$ for the convection-diffusion equation discretized
according to (4.5.11); alternating zebra smoothing with single damping; $n = 64$

A fixed damping parameter $\omega_s = 0.7$ gives good results also for $\rho$. The value $\omega_s = 0.7$ was
chosen after some experimentation.

We see that with $\omega_s = 0.7$ alternating zebra is robust and reasonably efficient for both
the convection-diffusion and the rotated anisotropic diffusion equation, provided the mixed
derivative is discretized according to (4.5.8).

4.11  Multistage  smoothing  methods 

As  we  will  see,  multistage  smoothing  methods  are  also  of  the  basic  iterative  method  type 
(3.1.3)  (of  the  semi-iterative  kind,  as  will  be  explained),  but  in  the  multigrid  literature  they 
are  usually  looked  upon  as  techniques  to  solve  systems  of  ordinary  differential  equations, 
arising  from  the  spatial  discretization  of  systems  of  hyperbolic  or  almost  hyperbolic  partial 
differential  equations. 

The convection-diffusion test problem (4.5.4) is of this type, but (4.5.3) is not. We will,
therefore, consider the application of multistage smoothing to (4.5.4) only. Multistage methods
have been introduced in [74] for the solution of the Euler equations of gas dynamics, and
as smoothing methods in a multigrid approach in [71]. For the simple scalar test problem
(4.5.4) multistage smoothing is less efficient than the better ones of the smoothing methods
discussed before. The simple test problem (4.5.4), however, lends itself well to explaining
the basic principles of multistage smoothing, which is the purpose of this section.

Artificial  time-derivative 

The  basic  idea  of  multistage  smoothing  is  to  add  a  time-derivative  to  the  equation  to  be 
solved,  and  to  use  a  time-stepping  method  to  damp  the  short  wavelength  components  of  the 
error.  The  time-stepping  method  is  of  multistage  (Runge-Kutta)  type.  Damping  of  short 
waves  occurs  only  if  the  discretization  is  dissipative,  which  implies  that  for  hyperbolic  or 
almost  hyperbolic  problems  some  form  of  upwind  discretization  must  be  used,  or  an  artificial 
dissipation  term  must  be  added.  Such  measures  are  required  anyway  to  obtain  good  solutions. 
The  test  problem  (4.5.4)  is  replaced  by 

\[ \frac{\partial u}{\partial t} - \varepsilon(u_{,11} + u_{,22}) + cu_{,1} + su_{,2} = f \tag{4.11.1} \]

Spatial discretization according to (4.5.10) or (4.5.11) gives a system of ordinary differential
equations denoted by

\[ \frac{du}{dt} = -h^{-2}Au + f \tag{4.11.2} \]

where $A$ is the operator defined in (4.5.10) or (4.5.11); $u$ is the vector of grid function values.

Multistage method

The time-derivative in (4.11.2) is an artefact; the purpose is to solve $Au = h^2 f$. Hence, the
temporal accuracy of the discretization is irrelevant. Denoting the time-level by a superscript
$n$ and stage number $k$ by a superscript $(k)$, a $p$-stage (Runge-Kutta) discretization of (4.11.2)
is given by

\[
\begin{aligned}
u^{(0)} &= u^n \\
u^{(k)} &= u^{(0)} - c_k\nu h^{-1}Au^{(k-1)} + c_k\Delta t f, \quad k = 1, 2, ..., p \\
u^{n+1} &= u^{(p)}
\end{aligned}
\tag{4.11.3}
\]

with $c_p = 1$. Here $\nu = \Delta t/h$ is the so-called Courant-Friedrichs-Lewy (CFL) number. Eliminating
$u^{(k)}$, this can be rewritten as

\[ u^{n+1} = P_p(-\nu h^{-1}A)u^n + Q_{p-1}(-\nu h^{-1}A)f \tag{4.11.4} \]

with the amplification polynomial $P_p$ a polynomial of degree $p$, defined by

\[ P_p(z) = 1 + z(1 + c_{p-1}z(1 + c_{p-2}z(...(1 + c_1z)...))) \tag{4.11.5} \]

and $Q_{p-1}$ a polynomial of degree $p - 1$ which plays no role in further discussion.

Semi-iterative methods

Obviously, equation (4.11.4) can be interpreted as an iterative method for solving $h^{-2}Au = f$
of the type introduced in Section 4.1 with iteration matrix

\[ S = P_p(-\nu h^{-1}A) \tag{4.11.6} \]

Such methods, for which the iteration matrix is a polynomial in the matrix of the system to
be solved, are called semi-iterative methods. See [129] for the theory of such methods. For
$p = 1$ (one-stage method) we have

\[ S = I - \nu h^{-1}A \tag{4.11.7} \]

which is in fact the damped Jacobi method (Section 4.3) with diagonal scaling ($\mathrm{diag}(A) = I$),
also known as the one-stage Richardson method. As a solution method for differential equations
this is known as the forward Euler method. Following the trend in the multigrid literature,
we will analyse method (4.11.3) as a multistage method for differential equations, but
the analysis could be couched in the language of linear algebra just as well.
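The nested form (4.11.5) can be evaluated with a short loop, and it can be checked against the stage recursion (4.11.3) applied to a single eigenmode. The following Python sketch is illustrative only: the function names are made up here, the right-hand side is taken as $f = 0$, and `z` stands for an eigenvalue of $-\nu h^{-1}A$.

```python
def amplification_polynomial(coeffs, z):
    """Evaluate P_p(z) of (4.11.5); coeffs = [c_1, ..., c_p] with c_p = 1."""
    value = 1.0
    for c in coeffs:            # innermost factor (1 + c_1 z) is built first
        value = 1.0 + c * z * value
    return value


def multistage_step(coeffs, z, u0):
    """Apply the stages of (4.11.3) to one scalar mode with f = 0."""
    u = u0
    for c in coeffs:
        u = u0 + c * z * u      # u^(k) = u^(0) + c_k z u^(k-1)
    return u


coeffs = [0.07, 0.19, 0.42, 1.0]   # four stages, c_4 = 1
z = complex(-1.5, 0.5)
print(amplification_polynomial(coeffs, z), multistage_step(coeffs, z, 1.0))
```

With `coeffs = [1.0]` the polynomial reduces to $1 + z$, the one-stage Richardson (forward Euler) case (4.11.7).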

The  amplification  factor 

The time step $\Delta t$ is restricted by stability. In order to assess this stability restriction and the
smoothing behaviour of (4.11.4), the Fourier series (4.3.7) is substituted for $u$. It suffices to
consider only one component $u = \psi(\theta)$, $\theta \in \Theta$. We have $\nu h^{-1}A\psi(\theta) = \nu h^{-1}\mu(\theta)\psi(\theta)$. With
$A$ defined by (4.5.11) one finds

\[
\mu(\theta) = 4\varepsilon + h(|c| + |s|) - (2\varepsilon + h|c|)\cos\theta_1
- (2\varepsilon + h|s|)\cos\theta_2 + ihc\sin\theta_1 + ihs\sin\theta_2
\tag{4.11.8}
\]

and

\[ u^{n+1} = g(\theta)u^n \tag{4.11.9} \]

with the amplification factor $g(\theta)$ given by

\[ g(\theta) = P_p(-\nu\mu(\theta)/h) \tag{4.11.10} \]
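Formulas (4.11.8) and (4.11.10) translate directly into code. The sketch below is a minimal illustration under the assumption that $c = \cos\beta$ and $s = \sin\beta$ for a flow angle $\beta$; the function names and argument order are made up for this example.

```python
import math


def symbol_upwind(eps, h, beta, theta1, theta2):
    """mu(theta) of (4.11.8), assuming c = cos(beta) and s = sin(beta)."""
    c, s = math.cos(beta), math.sin(beta)
    real = (4.0 * eps + h * (abs(c) + abs(s))
            - (2.0 * eps + h * abs(c)) * math.cos(theta1)
            - (2.0 * eps + h * abs(s)) * math.cos(theta2))
    imag = h * c * math.sin(theta1) + h * s * math.sin(theta2)
    return complex(real, imag)


def amplification_factor(coeffs, nu, eps, h, beta, theta1, theta2):
    """g(theta) = P_p(-nu*mu(theta)/h) with coeffs = [c_1, ..., c_p], c_p = 1."""
    z = -nu * symbol_upwind(eps, h, beta, theta1, theta2) / h
    g = 1.0
    for ck in coeffs:           # nested evaluation of P_p, see (4.11.5)
        g = 1.0 + ck * z * g
    return g


# One-stage method (forward Euler), grid-aligned flow, eps = 0:
print(abs(amplification_factor([1.0], 0.5, 0.0, 1.0 / 64, 0.0, math.pi, 0.0)))
```

The constant mode $\theta = (0, 0)$ always gives $\mu = 0$ and hence $g = 1$, which reflects the fact that smoothing only needs to reduce the rough modes.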

The  smoothing  factor 

The smoothing factor is defined as before:

\[ \rho = \max\{|g(\theta)| : \theta \in \Theta_r\} \tag{4.11.11} \]

in the case of periodic boundary conditions, and

\[ \rho_D = \max\{|g(\theta)| : \theta \in \Theta_r^D\} \tag{4.11.12} \]

for Dirichlet boundary conditions.

Stability condition

Stability requires that

\[ |g(\theta)| \le 1, \quad \forall\theta \in \Theta \tag{4.11.13} \]

The stability domain $D$ of the multistage method is defined as

\[ D = \{z \in \mathbb{C} : |P_p(z)| \le 1\} \tag{4.11.14} \]

Stability requires that $\nu$ is chosen such that $z = -\nu\mu(\theta)/h \in D$, $\forall\theta \in \Theta$. If $\rho < 1$ but
(4.11.13) is not satisfied, rough modes are damped but smooth modes are amplified, so that
the multistage method is unsuitable.

Local  time-stepping 

When the coefficients $c$ and $s$ in the convection-diffusion equation (4.11.1) are replaced by
general variable coefficients $v_1$ and $v_2$ (in fluid mechanics applications $v_1$, $v_2$ are fluid velocity
components), an appropriate definition of the CFL number is

\[ \nu = v\Delta t/h, \quad v = |v_1| + |v_2| \tag{4.11.15} \]

Hence, if $\Delta t$ is the same in every spatial grid point, as would be required for temporal
accuracy, $\nu$ will be variable if $v$ is not constant. For smoothing purposes it is better to fix $\nu$
at some favourable value, so that $\Delta t$ will be different in different grid points and on different
grids in multigrid applications. This is called local time-stepping.
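Local time-stepping amounts to solving (4.11.15) for $\Delta t$ at every grid point with $\nu$ held fixed. A minimal Python sketch, with a made-up function name and with the velocity components given as flat lists of grid values (an assumption of this example):

```python
def local_time_steps(v1, v2, h, nu):
    """Return dt_j = nu*h/(|v1_j| + |v2_j|) for every grid point j."""
    steps = []
    for a, b in zip(v1, v2):
        speed = abs(a) + abs(b)
        if speed == 0.0:
            # (4.11.15) gives no finite time step where the velocity vanishes
            raise ValueError("zero velocity at a grid point")
        steps.append(nu * h / speed)
    return steps


print(local_time_steps([1.0, -2.0], [0.0, 2.0], 0.5, 2.0))
```

On a coarser grid of a multigrid hierarchy the same call is made with the larger mesh size, so that the time step grows with $h$ while $\nu$ stays at its favourable value.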

Optimization  of  the  coefficients 

The stability restriction on the CFL number $\nu$ and the smoothing factor $\rho$ depend on the
coefficients $c_k$. In the classical Runge-Kutta methods for solving ordinary differential equations
these are chosen to optimize stability and accuracy. For analyses see for example [115],
[106]. For smoothing, $c_k$ is chosen to enhance not accuracy but smoothing; smoothing is also
influenced by $\nu$. The optimum values of $\nu$ and $c_k$ are problem dependent. Some analysis
of the optimization problem involved may be found in [127]. In general, this optimization
problem can only be solved numerically.

We  proceed  with  a  few  examples. 

A  four-stage  method 

Based upon an analysis of Catalano and Deconinck (private communication), in which optimal
coefficients $c_k$ and CFL number $\nu$ are sought for the upwind discretization (4.5.11) of (4.11.1)
with $\varepsilon = 0$, we choose

\[ c_1 = 0.07, \quad c_2 = 0.19, \quad c_3 = 0.42, \quad \nu = 2.0 \tag{4.11.16} \]

Table 4.11.1 gives some results.

$\varepsilon$    $\beta$ = 0°    $\beta$ = 15°    $\beta$ = 30°    $\beta$ = 45°

$0$              1.00            0.593            0.477            0.581
$10^{-5}$        0.997           0.591            0.482            0.587

Table 4.11.1: Smoothing factor $\rho$ for (4.11.1) discretized according to (4.5.11); four-stage
method; $n = 64$

It is found that $\rho_D$ differs very little from $\rho$. It is not necessary to choose $\beta$ outside
$[0^\circ, 45^\circ]$, since the results are symmetric in $\beta$. For $\varepsilon \sim 10^{-?}$ the method becomes unstable
for certain values of $\beta$. Hence, for problems in which the mesh Peclet number varies widely
in the domain it would seem necessary to adapt $c_k$ and $\nu$ to the local stencil. With $\varepsilon = 0$ all
multistage smoothers have $\rho = 1$ for grid-aligned flow ($\beta = 0^\circ$ or $90^\circ$): waves perpendicular
to the flow are not damped.
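A smoothing factor of the kind listed in Table 4.11.1 can be estimated by maximizing $|g(\theta)|$ over the rough wavenumbers. The Python sketch below is a hedged illustration and not a reproduction of the computation behind the table: it assumes $h = 1/n$, $c = \cos\beta$, $s = \sin\beta$, wavenumbers $\theta_\alpha = 2\pi k_\alpha/n$ with $k_\alpha = -n/2 + 1, ..., n/2$, and a rough set consisting of all wavenumbers with at least one $|\theta_\alpha| \ge \pi/2$.

```python
import math


def four_stage_rho(eps, beta_deg, n=64, nu=2.0,
                   coeffs=(0.07, 0.19, 0.42, 1.0)):
    """Maximum of |g(theta)| over the (assumed) rough wavenumber set."""
    h = 1.0 / n
    beta = math.radians(beta_deg)
    c, s = math.cos(beta), math.sin(beta)
    rho = 0.0
    for k1 in range(-n // 2 + 1, n // 2 + 1):
        for k2 in range(-n // 2 + 1, n // 2 + 1):
            if abs(k1) < n // 4 and abs(k2) < n // 4:
                continue                      # smooth wavenumber, skip
            t1 = 2.0 * math.pi * k1 / n
            t2 = 2.0 * math.pi * k2 / n
            mu = complex(                     # symbol (4.11.8)
                4.0 * eps + h * (abs(c) + abs(s))
                - (2.0 * eps + h * abs(c)) * math.cos(t1)
                - (2.0 * eps + h * abs(s)) * math.cos(t2),
                h * c * math.sin(t1) + h * s * math.sin(t2))
            z = -nu * mu / h
            g = 1.0
            for ck in coeffs:                 # nested form (4.11.5)
                g = 1.0 + ck * z * g
            rho = max(rho, abs(g))
    return rho


for beta_deg in (0.0, 15.0, 30.0, 45.0):
    print(beta_deg, four_stage_rho(0.0, beta_deg))
```

For $\varepsilon = 0$ and $\beta = 0^\circ$ the mode $\theta = (0, \pi)$ has $\mu = 0$, so the routine cannot return less than 1, in line with the remark on grid-aligned flow.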

A five-stage method

The following method has been proposed in [73] for a central discretization of the Euler
equations of gas dynamics:

\[ c_1 = 1/4, \quad c_2 = 1/6, \quad c_3 = 3/8, \quad c_4 = 1/2 \tag{4.11.17} \]

The method has also been applied to the compressible Navier-Stokes equations in [75]. We
will apply this method to test problem (4.11.1) with the central discretization (4.5.10). Since
$\mu(\theta) = ih(c\sin\theta_1 + s\sin\theta_2)$ we have $\mu(0, \pi) = 0$, hence $|g(0, \pi)| = 1$, so that we have no
smoother. An artificial dissipation term is therefore added to (4.11.2), which becomes

\[ \frac{du}{dt} = -h^{-2}Au - h^{-1}Bu + f \tag{4.11.18} \]

with

\[
[B] = \chi\begin{bmatrix}
 & & 1 & & \\
 & & -4 & & \\
1 & -4 & 12 & -4 & 1 \\
 & & -4 & & \\
 & & 1 & &
\end{bmatrix}
\tag{4.11.19}
\]

where $\chi$ is a parameter.

We  have  B'ijy{0)  =  with 
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0®  15^^  45^ 

p  0.70  0.77  0.82  0.82 


Table  4.11.2:  Smoothing  factor  p  for  (4.11.1)  discretized  according  to  (4.5.10);  five-stage 
method;  n  =  64 

r){e)  =  4x[(l  -  cos  ^i)^  +  (1  -  cos  $2^]  (4.11.20) 

For reasons of efficiency the artificial dissipation term is updated in [73] only in the first two stages. This gives the following five-stage method:

$$\begin{aligned}
u^{(k)} &= u^{(0)} - c_k\nu\,(h^{-1}A + B)\,u^{(k-1)}, & k &= 1, 2\\
u^{(k)} &= u^{(0)} - c_k\nu\,(h^{-1}Au^{(k-1)} + Bu^{(1)}), & k &= 3, 4, 5
\end{aligned} \tag{4.11.21}$$

with $c_5 = 1$. The amplification polynomial now depends on two arguments $z_1, z_2$ defined by $z_1 = \nu h^{-1}\mu(\theta)$, $z_2 = \nu\eta(\theta)$, and is given by the following algorithm:

$$\begin{aligned}
P_1 &= 1 - c_1(z_1 + z_2), \qquad P_2 = 1 - c_2(z_1 + z_2)P_1\\
P_3 &= 1 - c_3z_1P_2 - c_3z_2P_1, \qquad P_4 = 1 - c_4z_1P_3 - c_4z_2P_1\\
P_5(z_1, z_2) &= 1 - z_1P_4 - z_2P_1
\end{aligned} \tag{4.11.22}$$

In one dimension Jameson and Baker [73] advocate $\nu = 3$ and $\chi = 0.04$; for stability $\nu$ should not be much larger than 3. In two dimensions $\max\{\nu h^{-1}|\mu(\theta)|\} = \nu(c + s) \leq \nu\sqrt{2}$. Choosing $\nu\sqrt{2} = 3$ gives $\nu \approx 2.1$. With $\nu = 2.1$ and $\chi = 0.04$ we obtain the results of Table 4.11.2, for both $\varepsilon = 0$ and $\varepsilon = 10^{-5}$. Again, $\rho_D \approx \rho$. This method allows only $\varepsilon \ll 1$; for example, for $\varepsilon = 10^{-3}$ and $\beta = 45°$ we find $\rho = 0.96$.
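The recursion for the amplification polynomial can be evaluated directly. The following Python sketch is an illustration only: the function names `p5` and `amplification` are made up here, the case $\varepsilon = 0$ is assumed, and $c_5 = 1$ is read off from the last line of (4.11.22). It evaluates $|P_5|$ for a single Fourier mode; it does not reproduce the full smoothing factor computation behind Table 4.11.2.

```python
import math

# Coefficients c_1..c_4 from (4.11.17); c_5 = 1 as implied by P_5 in (4.11.22).
C = (1 / 4, 1 / 6, 3 / 8, 1 / 2, 1.0)


def p5(z1, z2):
    """Amplification polynomial P_5(z1, z2), following (4.11.22)."""
    p1 = 1 - C[0] * (z1 + z2)
    p2 = 1 - C[1] * (z1 + z2) * p1
    # From stage 3 on, the dissipation term stays frozen at B u^(1), i.e. z2 * p1.
    p3 = 1 - C[2] * z1 * p2 - C[2] * z2 * p1
    p4 = 1 - C[3] * z1 * p3 - C[3] * z2 * p1
    return 1 - C[4] * z1 * p4 - C[4] * z2 * p1


def amplification(theta1, theta2, beta_deg, nu=2.1, chi=0.04):
    """|P_5| for one Fourier mode, central scheme, eps = 0 (assumed)."""
    c = math.cos(math.radians(beta_deg))
    s = math.sin(math.radians(beta_deg))
    z1 = nu * 1j * (c * math.sin(theta1) + s * math.sin(theta2))  # nu * mu / h
    eta = 4 * chi * ((1 - math.cos(theta1)) ** 2 + (1 - math.cos(theta2)) ** 2)
    return abs(p5(z1, nu * eta))


no_dissipation = amplification(0.0, math.pi, 0.0, chi=0.0)    # mode is not damped
with_dissipation = amplification(0.0, math.pi, 0.0)           # mode is damped
```

For the mode $\theta = (0, \pi)$ and grid-aligned flow, the first value has modulus 1 (no smoothing), whereas the artificial dissipation term brings the second value well below 1.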
Final remarks

Advantages of multistage smoothing are excellent vectorization and parallelization potential, and easy generalization to systems of differential equations. Multistage methods are in widespread use for hyperbolic and almost hyperbolic systems in computational fluid dynamics. They are not, however, robust, because, like all point-wise smoothing methods, they do not work when the unknowns are strongly coupled in one direction due to high mesh aspect ratios. Also their smoothing factors are not small. Various stratagems have been proposed in the literature to improve multistage smoothing, such as residual averaging, including implicit stages, and local adaptation of $c_k$, but we will not discuss this here; see [73], [75] and [127].

4.12  Concluding  remarks 

In this chapter Fourier smoothing analysis has been explained, and the efficiency and robustness of a great number of smoothing methods have been investigated by determining the smoothing factors $\rho$ and $\rho_D$ for the two-dimensional test problems (4.5.3) and (4.5.4). The following methods work for both problems, assuming the mixed derivative in (4.5.3) is suitably discretized, either with (4.5.6) or (4.5.8):

(i)  Damped  alternating  Jacobi;

(ii)  Alternating  symmetric  line  Gauss-Seidel;

(iii)  Alternating  modified  incomplete  point  factorization; 

(iv)  Incomplete  block  factorization; 

(v)  Alternating  damped  zebra  Gauss-Seidel. 

Where  damping  is  needed  the  damping  parameter  can  be  fixed,  independent  of  the  problem. 
It  is  important  to  take  the  type  of  boundary  condition  into  account.  The  heuristic  way  in 
which  this  has  been  done  within  the  framework  of  Fourier  smoothing  analysis  correlates  well 
with  multigrid  convergence  results  obtained  in  practice. 

Generalization of incomplete factorization to systems of differential equations and to non-linear equations is less straightforward than for the other methods. Application to the incompressible Navier-Stokes equations has, however, been worked out in [144], [146], [148], [150] and [149], and is discussed in [141].

Of  course,  in  three  dimensions  robust  and  efficient  smoothers  are  more  elusive  than  in  two 
dimensions.  Incomplete  block  factorization,  the  most  powerful  smoother  in  two  dimensions, 
is  not  robust  in  three  dimensions  [81].  Robust  three-dimensional  smoothers  can  be  found
among  methods  that  solve  accurately  in  planes  (plane  Gauss-Seidel)  [114].  For  a  successful
multigrid  approach  to  a  complicated  three-dimensional  problem  using  ILU  type  smoothing, 
see  [124],  [122],  [125],  [123]. 

5  Prolongation,  restriction  and  coarse  grid  approximation 

5.1  Introduction 

In  this  chapter  the  transfer  operations  between  fine  and  coarse  grids  are  discussed. 

Fine  grids 

The  domain  in  which  the  partial  differential  equation  is  to  be  solved  is  assumed  to  be  the 
d-dimensional  unit  cube.  In  the  case  of  vertex-centered  discretization,  the  computational  grid 
is  defined  by 

$$G = \{x \in \mathbb{R}^d : x = jh,\ j = (j_1, j_2, \ldots, j_d),\ h = (h_1, h_2, \ldots, h_d),\ j_\alpha = 0, 1, 2, \ldots, n_\alpha,\ h_\alpha = 1/n_\alpha,\ \alpha = 1, 2, \ldots, d\} \tag{5.1.1}$$

In the case of cell-centered discretization, G is defined by

$$G = \{x \in \mathbb{R}^d : x = (j - s)h,\ j = (j_1, j_2, \ldots, j_d),\ s = (1, 1, \ldots, 1)/2,\ h = (h_1, h_2, \ldots, h_d),\ j_\alpha = 1, 2, \ldots, n_\alpha,\ h_\alpha = 1/n_\alpha,\ \alpha = 1, 2, \ldots, d\} \tag{5.1.2}$$

These grids, on which the given problem is to be solved, are called fine grids. Without danger of confusion, we will also consider G to be the set of d-tuples j occurring in (5.1.1) or (5.1.2).

Coarse  grids 

In this chapter it suffices to consider only one coarse grid. From the vertex-centered grid (5.1.1) a coarse grid is derived by vertex-centered coarsening, and from the cell-centered grid (5.1.2) a coarse grid is derived by cell-centered coarsening. Coarse grid quantities will be identified by an overbar. Vertex-centered coarsening consists of deleting every other vertex in each direction. Cell-centered coarsening consists of taking unions of fine grid cells to obtain coarse grid cells. Figures 5.1.1 and 5.1.2 give an illustration. It is assumed that $n_\alpha$ in (5.1.1) and (5.1.2) is even.
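The two grid types and the two kinds of coarsening can be illustrated in one dimension. The following Python sketch is only an illustration under that restriction (d = 1); all function names are chosen here and do not come from the notes.

```python
def vertex_centered_grid(n):
    """Points x = j*h, j = 0, ..., n, with h = 1/n; cf. (5.1.1) for d = 1."""
    h = 1.0 / n
    return [j * h for j in range(n + 1)]


def cell_centered_grid(n):
    """Cell centres x = (j - 1/2)*h, j = 1, ..., n; cf. (5.1.2) for d = 1."""
    h = 1.0 / n
    return [(j - 0.5) * h for j in range(1, n + 1)]


def coarsen_vertex_centered(x):
    """Delete every other vertex (n must be even)."""
    assert (len(x) - 1) % 2 == 0
    return x[::2]


def coarsen_cell_centered(x):
    """Merge pairs of neighbouring cells; the coarse centre is their midpoint."""
    assert len(x) % 2 == 0
    return [0.5 * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]


fine_v = vertex_centered_grid(4)      # [0.0, 0.25, 0.5, 0.75, 1.0]
fine_c = cell_centered_grid(4)        # [0.125, 0.375, 0.625, 0.875]
coarse_v = coarsen_vertex_centered(fine_v)
coarse_c = coarsen_cell_centered(fine_c)
```

In both cases the coarse grid is again a grid of the same type with n/2 intervals, which is why n is required to be even.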

Denote spaces of grid functions by $U$ and $\bar U$:

$$U = \{u : G \to \mathbb{R}\}, \qquad \bar U = \{\bar u : \bar G \to \mathbb{R}\} \tag{5.1.3}$$

The transfer operators are denoted by P and R:

$$P : \bar U \to U, \qquad R : U \to \bar U \tag{5.1.4}$$

Figure 5.1.1: Vertex-centered and cell-centered coarsening in one dimension. (• grid points)

P is called prolongation, and R restriction.

Here only vertex-centered coarsening will be discussed. For cell-centered coarsening, see [141].

5.2  Stencil  notation 

In order to obtain a concise description of the transfer operators, stencil notation will be used.

Stencil notation for operators of type $U \to U$

Let $A : U \to U$ be a linear operator. Then, using stencil notation, Au can be denoted by

$$(Au)_i = \sum_{j \in \mathbb{Z}^d} A(i,j)\,u_{i+j}, \quad i \in G \tag{5.2.1}$$

with $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$. The subscript $i = (i_1, i_2, \ldots, i_d)$ identifies a point in the computational grid in the usual way; cf. Figure 5.1.2 for the case d = 2.

The set $S_A$ defined by

$$S_A = \{j \in \mathbb{Z}^d : \exists\, i \in G \text{ with } A(i,j) \neq 0\} \tag{5.2.2}$$

is called the structure of A. The set of values $A(i,j)$ with $j \in S_A$ is called the stencil of A at grid point i. Often the word 'stencil' refers more specifically to an array of values denoted by $[A]_i$ in which the values of $A(i,j)$ are given; for example, in two dimensions,

$$[A]_i = \begin{bmatrix} A(i, -e_1 + e_2) & A(i, e_2) & \\ A(i, -e_1) & A(i, 0) & A(i, e_1) \\ & A(i, -e_2) & A(i, e_1 - e_2) \end{bmatrix} \tag{5.2.3}$$

where $e_1 = (1,0)$ and $e_2 = (0,1)$. For the representation of three-dimensional stencils, see [141].
Figure 5.1.2: Vertex-centered and cell-centered coarsening in two dimensions. (• grid points.) Panels: vertex-centred; cell-centred.

Example 5.2.1 One-dimensional discrete Laplacian:

$$[A]_i = h^{-2}[-1 \quad 2 \quad -1] \tag{5.2.4}$$
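As an illustration of how a stencil acts on a grid function, the following Python sketch applies (5.2.1) in one dimension. The representation of a stencil as a function returning a dictionary `{offset: value}` and the name `apply_stencil` are choices made for this example only.

```python
def apply_stencil(stencil, u):
    """(Au)_i = sum_j A(i, j) * u_{i+j} in one dimension, cf. (5.2.1).

    `stencil(i)` returns a dict {offset j: A(i, j)}; u is taken to be zero
    outside the grid.
    """
    n = len(u)
    out = []
    for i in range(n):
        total = 0.0
        for j, a in stencil(i).items():
            if 0 <= i + j < n:
                total += a * u[i + j]
        out.append(total)
    return out


h = 0.25
laplace = lambda i: {-1: -1 / h**2, 0: 2 / h**2, 1: -1 / h**2}   # stencil (5.2.4)
u = [x * (1 - x) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
Au = apply_stencil(laplace, u)
```

Since u = x(1 - x) is a quadratic with -u'' = 2, the interior entries of `Au` equal 2; the two boundary entries are not meaningful because the stencil is truncated there.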


Stencil notation for restriction operators

Let $R : U \to \bar U$ be a restriction operator. Then, using stencil notation, Ru can be represented by

$$(Ru)_i = \sum_{j \in \mathbb{Z}^d} R(i,j)\,u_{2i+j}, \quad i \in \bar G \tag{5.2.5}$$

Example 5.2.2 Consider vertex-centered grids $G$, $\bar G$ for d = 1 as defined by (5.1.1) and as depicted in Figure 5.1.1. Let R be defined by

$$(Ru)_i = w_iu_{2i-1} + \tfrac12 u_{2i} + e_iu_{2i+1}, \quad i = 0, 1, \ldots, n/2 \tag{5.2.6}$$

with $w_0 = 0$; $w_i = 1/4$, $i \neq 0$; $e_i = 1/4$, $i \neq n/2$; $e_{n/2} = 0$. Then we have (cf. (5.2.5)):

$$R(i,-1) = w_i, \quad R(i,0) = 1/2, \quad R(i,1) = e_i \tag{5.2.7}$$

or

$$[R]_i = [w_i \quad 1/2 \quad e_i] \tag{5.2.8}$$

We can also write $[R] = [1 \quad 2 \quad 1]/4$ and stipulate that stencil elements that refer to values of u at points outside G are to be replaced by 0.
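The restriction of Example 5.2.2 can be written out as follows. This is a minimal Python sketch for d = 1; the function name is hypothetical, and stencil entries that would reach outside the fine grid are simply skipped, which corresponds to $w_0 = e_{n/2} = 0$.

```python
def restrict_full_weighting(u):
    """(Ru)_i = sum_j R(i, j) * u_{2i+j} with [R] = [1 2 1]/4, cf. (5.2.5)-(5.2.8).

    u lives on a vertex-centered fine grid with n + 1 points, n even.
    """
    n = len(u) - 1
    assert n % 2 == 0
    stencil = {-1: 0.25, 0: 0.5, 1: 0.25}
    coarse = []
    for i in range(n // 2 + 1):
        coarse.append(sum(r * u[2 * i + j] for j, r in stencil.items()
                          if 0 <= 2 * i + j <= n))
    return coarse


coarse = restrict_full_weighting([0.0, 1.0, 2.0, 3.0, 4.0])
```

In the interior a linear function is reproduced exactly (the value 2.0 at the middle coarse point); at the two boundary points the truncated stencil sums to 3/4 rather than 1, so the values there differ.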

The relation between the stencil of an operator and that of its adjoint

For prolongation operators, a nice definition of stencil notation is less obvious than for restriction operators. As a preparation for the introduction of a suitable definition we first discuss the relation between the stencil of an operator and its adjoint. Define the inner product on U in the usual way:

$$(u, v) = \sum_{i \in \mathbb{Z}^d} u_iv_i \tag{5.2.9}$$

where u and v are defined to be zero outside G. Define the transpose $A^*$ of $A : U \to U$ in the usual way by

$$(Au, v) = (u, A^*v), \quad \forall u, v \in U \tag{5.2.10}$$

Defining $A(i,j) = 0$ for $i \notin G$ or $j \notin S_A$ we can write

$$\begin{aligned}
(Au, v) &= \sum_{i,j \in \mathbb{Z}^d} A(i,j)\,u_{i+j}v_i = \sum_{i,k \in \mathbb{Z}^d} A(i,k-i)\,u_kv_i\\
&= \sum_{k \in \mathbb{Z}^d} u_k \sum_{i \in \mathbb{Z}^d} A(i,k-i)\,v_i = (u, A^*v)
\end{aligned} \tag{5.2.11}$$

with

$$(A^*v)_k = \sum_{i \in \mathbb{Z}^d} A(i,k-i)\,v_i = \sum_{i \in \mathbb{Z}^d} A(i+k,-i)\,v_{k+i} = \sum_{i \in \mathbb{Z}^d} A^*(k,i)\,v_{k+i} \tag{5.2.12}$$

Hence, we obtain the following relation between the stencils of A and $A^*$:

$$A^*(k,i) = A(k+i,-i) \tag{5.2.13}$$
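Relation (5.2.13) can be checked numerically by comparing the matrix built from the adjoint stencil with the transposed matrix of A. The Python sketch below does this in one dimension for an arbitrary non-symmetric three-point stencil; the helper names and the particular stencil values are invented for the test.

```python
def stencil_to_matrix(stencil, n):
    """Dense matrix M[i][i+j] = A(i, j) for a one-dimensional grid with n points."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j, a in stencil(i).items():
            if 0 <= i + j < n:
                m[i][i + j] = a
    return m


def adjoint_stencil(stencil, n):
    """A*(k, i) = A(k + i, -i), cf. (5.2.13); A(i, j) = 0 for i outside the grid."""
    offsets = {-j for i in range(n) for j in stencil(i)}

    def star(k):
        return {i: stencil(k + i).get(-i, 0.0)
                for i in offsets if 0 <= k + i < n}
    return star


n = 6
a = lambda i: {-1: -(i + 1.0), 0: 2.0 * i + 3.0, 1: -(i + 2.0) ** 2}  # non-symmetric
m = stencil_to_matrix(a, n)
m_star = stencil_to_matrix(adjoint_stencil(a, n), n)
transpose = [[m[j][i] for j in range(n)] for i in range(n)]
```

The two matrices `m_star` and `transpose` coincide, which is the one-dimensional case of Exercise 5.2.1 for A.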

Stencil notation for prolongation operators

If $R : U \to \bar U$, then $R^* : \bar U \to U$ is a prolongation. The stencil of $R^*$ is obtained in similar fashion as that of $A^*$. Defining $R(i,j) = 0$ for $i \notin \bar G$ or $j \notin S_R$, we have

$$\begin{aligned}
(Ru, v) &= \sum_{i,j \in \mathbb{Z}^d} R(i,j)\,u_{2i+j}v_i = \sum_{i,k \in \mathbb{Z}^d} R(i,k-2i)\,u_kv_i\\
&= \sum_{k \in \mathbb{Z}^d} u_k \sum_{i \in \mathbb{Z}^d} R(i,k-2i)\,v_i = (u, R^*v)
\end{aligned} \tag{5.2.14}$$

with $R^* : \bar U \to U$ defined by

$$(R^*v)_k = \sum_{i \in \mathbb{Z}^d} R(i,k-2i)\,v_i \tag{5.2.15}$$

Equation (5.2.15) shows how to define the stencil of a prolongation operator $P : \bar U \to U$:

$$(P\bar u)_i = \sum_{j \in \mathbb{Z}^d} P^*(j,i-2j)\,\bar u_j \tag{5.2.16}$$

Hence, a convenient way to define P is by specifying $P^*$. Equation (5.2.16) is the desired stencil notation for prolongation operators.

Suppose a rule has been specified to determine $P\bar u$ for given $\bar u$, then $P^*(k,m)$ can be obtained as follows. Choose $\bar u = \delta^k$ as follows:

$$\delta^k_k = 1, \qquad \delta^k_j = 0, \quad j \neq k \tag{5.2.17}$$

Then (5.2.16) gives $P^*(k,i-2k) = (P\delta^k)_i$, or

$$P^*(k,j) = (P\delta^k)_{2k+j}, \quad k \in \bar G,\ 2k + j \in G \tag{5.2.18}$$

In other words, $[P^*]_k$ is precisely the image of $\delta^k$ under P.

The  usefulness  of  stencil  notation  will  become  increasingly  clear  in  what  follows. 

Exercise 5.2.1 Verify that (5.2.13) and (5.2.15) imply that, if A and R are represented by matrices, $A^*$ and $R^*$ follow from A and R by interchanging rows and columns. (Remark: for d = 1 this is easy; for d > 1 this exercise is a bit technical in the case of R).

Exercise 5.2.2 Show that if the matrix representation of $A : U \to U$ is symmetric, then its stencil has the property $A(k,i) = A(k+i,-i)$.

5.3  Interpolating  transfer  operators 

We  begin  by  giving  a  number  of  examples  of  prolongation  operators,  based  on  interpolation. 

Let d = 1, and let $G$ and $\bar G$ be vertex-centred (cf. Figure 5.1.1). Defining $P : \bar U \to U$ by linear interpolation, we have

$$(P\bar u)_{2i} = \bar u_i, \qquad (P\bar u)_{2i+1} = \tfrac12(\bar u_i + \bar u_{i+1}) \tag{5.3.1}$$

Using (5.3.1) we find that the stencil of $P^*$ is given by

$$[P^*] = \tfrac12[1 \quad 2 \quad 1] \tag{5.3.2}$$
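The stencil of $P^*$ can also be found mechanically as the image of a unit grid function under P, as stated in (5.2.18). The following Python sketch does this for one-dimensional linear interpolation; the function names are illustrative and not taken from the notes.

```python
def prolong_linear(uc):
    """Linear interpolation: (Pu)_{2i} = u_i, (Pu)_{2i+1} = (u_i + u_{i+1}) / 2."""
    fine = []
    for i in range(len(uc) - 1):
        fine += [uc[i], 0.5 * (uc[i] + uc[i + 1])]
    return fine + [uc[-1]]


def p_star_stencil(prolong, n_coarse, k, offsets=(-1, 0, 1)):
    """P*(k, j) = (P delta^k)_{2k+j}, cf. (5.2.18), for an interior coarse point k."""
    delta = [1.0 if j == k else 0.0 for j in range(n_coarse)]
    image = prolong(delta)
    return [image[2 * k + j] for j in offsets]


stencil = p_star_stencil(prolong_linear, n_coarse=5, k=2)
```

For the interior coarse point k = 2 this yields the entries 1/2, 1, 1/2, in agreement with (5.3.2).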
In two dimensions, linear interpolation is exact for functions $f(x_1,x_2) = 1, x_1, x_2$ and takes place in triangles, cf. Figure 5.3.1. Choosing triangles ABD and ACD for interpolation, one obtains $u_A = \bar u_A$, $u_a = \tfrac12(\bar u_A + \bar u_B)$, $u_e = \tfrac12(\bar u_A + \bar u_D)$, etc. Alternatively, one may choose triangles ABC and BDC, which makes no essential difference. Bilinear interpolation is exact for functions $f(x_1,x_2) = 1, x_1, x_2, x_1x_2$, and takes place in the rectangle ABCD. The only difference with linear interpolation is that now $u_e = \tfrac14(\bar u_A + \bar u_B + \bar u_C + \bar u_D)$. In other words: $u_{2i+e_1+e_2} = \tfrac14(\bar u_i + \bar u_{i+e_1} + \bar u_{i+e_2} + \bar u_{i+e_1+e_2})$, with $e_1 = (1,0)$ and $e_2 = (0,1)$.

    C  d  D
    b  e  c
    A  a  B

Figure 5.3.1: Interpolation in two dimensions, vertex-centered grids. (Coarse grid points: capital letters; fine grid points: capital and lower case letters.)
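A compact way to implement bilinear interpolation on vertex-centered grids is sketched below in Python. The function name and the list-of-lists storage are assumptions of this example. The image of a unit coarse grid function then shows the weights 1, 1/2 and 1/4 with which a coarse value is distributed over the surrounding fine grid points.

```python
def prolong_bilinear(uc):
    """Bilinear interpolation from an (m+1) x (m+1) vertex-centered coarse grid."""
    m = len(uc) - 1
    n = 2 * m
    uf = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            # Coarse neighbours of fine point (i, j); they coincide for even indices.
            i0, i1 = i // 2, (i + 1) // 2
            j0, j1 = j // 2, (j + 1) // 2
            uf[i][j] = 0.25 * (uc[i0][j0] + uc[i1][j0] + uc[i0][j1] + uc[i1][j1])
    return uf


m = 4
delta = [[1.0 if (i, j) == (2, 2) else 0.0 for j in range(m + 1)]
         for i in range(m + 1)]
image = prolong_bilinear(delta)
# 3 x 3 block of the image around the fine grid point (4, 4)
block = [[image[4 + di][4 + dj] for dj in (-1, 0, 1)] for di in (-1, 0, 1)]
```

The block equals one quarter of the array with rows (1, 2, 1), (2, 4, 2), (1, 2, 1).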

The stencil for bilinear interpolation is

$$[P^*] = \frac14\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} \tag{5.3.3}$$

For the three-dimensional case, see [141].

Restrictions

We can be brief about restrictions. One may simply take

$$R = \sigma P^* \tag{5.3.4}$$

with $\sigma$ a suitable scaling factor. The scaling of R, i.e. the value of $\sum_j R(i,j)$, is important.

If Ru is to be a coarse grid approximation of u (this situation occurs in non-linear multigrid methods, which will be discussed later), then one should obviously have $\sum_j R(i,j) = 1$. If, however, R is used to transfer the residual r to the coarse grid, then the correct value of $\sum_j R(i,j)$ depends on the scaling of the coarse and fine grid problems. The rule is that the coarse grid problem should be consistent with the differential problem in the same way as the fine grid problem. This means the following. Let the differential equation to be solved be denoted as

$$Lu = s \tag{5.3.5}$$

and the discrete approximation on the fine grid by

$$Au = b \tag{5.3.6}$$

Suppose that (5.3.6) is scaled such that it is consistent with $h^\alpha Lu = h^\alpha s$, with h a measure of the mesh-size of G. Finite volume discretization leads naturally to $\alpha = d$ with d the number of dimensions; often (5.3.6) is scaled in order to get rid of divisions by h. Let the discrete approximation of (5.3.5) on the coarse grid $\bar G$ be denoted by

$$\bar A\bar u = Rb \tag{5.3.7}$$

and let $\bar A$ approximate $\bar h^\alpha L$. Then Rb should approximate $\bar h^\alpha s$. Since b approximates $h^\alpha s$ we find a scaling rule, as follows.

Rule (scaling of R):

$$\sum_j R(i,j) = (\bar h/h)^\alpha \tag{5.3.8}$$

We emphasize that this rule applies only if R is to be applied to right-hand sides and/or residuals. Depending on the way the boundary conditions are implemented, at the boundaries $\alpha$ may be different from the interior. Hence the scaling of R should be different at the boundary. Another reason why $\sum_j R(i,j)$ may come out different at the boundary is that use is made of the fact that due to the boundary conditions the residual to be restricted is known to be zero in certain points.
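As a worked instance of the scaling rule, consider one dimension with standard coarsening, so that the coarse mesh size is twice the fine one, and take the restriction proportional to the transpose of linear interpolation, with $\sigma$ the scaling factor in $R = \sigma P^*$. The three cases listed below are illustrations chosen here, not taken from the notes.

```latex
\[
\bar h = 2h, \qquad [P^*] = \tfrac12[1\ \ 2\ \ 1], \qquad \sum_j P^*(i,j) = 2
\quad\Longrightarrow\quad
\sigma = \tfrac12\,(\bar h/h)^{\alpha} = 2^{\alpha-1}.
\]
\[
\begin{array}{lll}
\alpha = 0 \ (A \approx L): & \sum_j R(i,j) = 1, & [R] = \tfrac14[1\ \ 2\ \ 1] \\[2pt]
\alpha = 1 \ (A \approx hL,\ \text{finite volumes},\ d = 1): & \sum_j R(i,j) = 2, & [R] = \tfrac12[1\ \ 2\ \ 1] = [P^*] \\[2pt]
\alpha = 2 \ (A \approx h^2L): & \sum_j R(i,j) = 4, & [R] = [1\ \ 2\ \ 1]
\end{array}
\]
```

So the familiar weights [1 2 1]/4 are the right choice for residual transfer only when the discrete equations are not multiplied by a power of h.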

A restriction that cannot be obtained by (5.3.4) with interpolating prolongation is injection:

$$(Ru)_i = \sigma u_{2i} \tag{5.3.9}$$

Accuracy condition for transfer operators

The proofs of mesh-size independent rate of convergence of MG assume that P and R satisfy certain conditions [21], [57]. The last reference (p. 149) gives the following simple condition:

$$m_P + m_R > 2m \tag{5.3.10}$$

A necessary condition (not discussed here) is given in [66]. Here the orders $m_P$, $m_R$ of P and R are defined as the highest degree plus one of the polynomials that are interpolated exactly by P or $sR^*$, respectively, with s a scaling factor that can be chosen freely, and 2m is the order of the partial differential equation to be solved. For example, (5.3.9) has $m_R = 0$, (5.3.3) has $m_P = 2$. Practical experience (see e.g. [139]) confirms that (5.3.10) is necessary.
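The arithmetic behind the accuracy condition can be spelled out for a second-order equation (2m = 2), using only the orders quoted above. The pairing of operators in the two lines below is an illustration chosen here.

```latex
\[
\begin{array}{ll}
\text{bilinear } P,\ R = \sigma P^*: & m_P + m_R = 2 + 2 = 4 > 2 \quad \text{(condition satisfied)} \\[2pt]
\text{bilinear } P,\ R = \text{injection}: & m_P + m_R = 2 + 0 = 2 \not> 2 \quad \text{(condition violated)}
\end{array}
\]
```

In the first line $m_R = 2$ because $sR^* = P$ for $s = 1/\sigma$, so the restriction inherits the order of the bilinear prolongation.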
Operator-dependent transfer operators

If the coefficients in the differential equations are discontinuous across certain interfaces between subdomains of different physical properties, then the solution u is not continuously differentiable, and linear interpolation across discontinuities in the derivatives of u is inaccurate. (See [141] for more details). Instead of interpolation, operator-dependent prolongation has to be used. Such prolongations aim to approximate the correct jump condition by using information from the discrete operator. They are required only in vertex-centered multigrid, but not in cell-centered multigrid, as shown in [141], where a full discussion of operator-dependent transfer operators may be found.

5.4 Coarse grid Galerkin approximation

The problem to be solved on the fine grid is denoted by

$$Au = f \tag{5.4.1}$$

The two-grid algorithm (2.3.14) requires an approximation $\bar A$ of A on the coarse grid. There are basically two ways to choose $\bar A$, as already discussed in Chapter 2.

(i) Discretization coarse grid approximation (DCA): like A, $\bar A$ is obtained by discretization of the partial differential equation.

(ii) Galerkin coarse grid approximation (GCA):

$$\bar A = RAP \tag{5.4.2}$$

A discussion of (5.4.2) has been given in Chapter 2.

The construction of $\bar A$ with DCA does not need to be discussed further. We will use stencil notation to obtain simple formulae to compute $\bar A$ with GCA. The two methods will be compared, and some theoretical background will be given.

Explicit  formula  for  coarse  grid  operator 

The matrices R and P are very sparse and have a rather irregular sparsity pattern. Stencil notation provides a very simple and convenient storage scheme. Storage rather than repeated evaluation is to be recommended if R and P are operator-dependent. We will derive formulae for $\bar A$ using stencil notation. We have (cf. (5.2.16))

$$(P\bar u)_i = \sum_j P^*(j,i-2j)\,\bar u_j \tag{5.4.3}$$

Unless indicated otherwise, summation takes place over $\mathbb{Z}^d$. Equation (5.2.1) gives

$$(AP\bar u)_i = \sum_k A(i,k)(P\bar u)_{i+k} = \sum_k\sum_j A(i,k)\,P^*(j,i+k-2j)\,\bar u_j \tag{5.4.4}$$

Finally, equation (5.2.5) gives

$$\begin{aligned}
(RAP\bar u)_i &= \sum_m R(i,m)(AP\bar u)_{2i+m}\\
&= \sum_m\sum_k\sum_j R(i,m)\,A(2i+m,k)\,P^*(j,2i+m+k-2j)\,\bar u_j
\end{aligned} \tag{5.4.5}$$

With the change of variables $j = i + n$ this becomes

$$(RAP\bar u)_i = \sum_m\sum_k\sum_n R(i,m)\,A(2i+m,k)\,P^*(i+n,m+k-2n)\,\bar u_{i+n} \tag{5.4.6}$$

from which it follows that

$$\bar A(i,n) = \sum_m\sum_k R(i,m)\,A(2i+m,k)\,P^*(i+n,m+k-2n) \tag{5.4.7}$$

For calculation of $\bar A$ by computer the ranges of m and k have to be finite. $S_A$ is the structure of A as defined in (5.2.2), and $S_R$ is the structure of R, i.e.

$$S_R = \{j \in \mathbb{Z}^d : \exists\, i \in \bar G \text{ with } R(i,j) \neq 0\} \tag{5.4.8}$$

Equation (5.4.7) is equivalent to

$$\bar A(i,n) = \sum_{m \in S_R}\sum_{k \in S_A} R(i,m)\,A(2i+m,k)\,P^*(i+n,m+k-2n) \tag{5.4.9}$$

With this formula, computation of $\bar A$ is straightforward, as we will now show.

Calculation of coarse grid operator by computer

For efficient computation of $\bar A$ it is useful to first determine $S_{\bar A}$. This can be done with the following algorithm.

    Algorithm STRURAP
    comment Calculation of $S_{\bar A}$
    begin $S_{\bar A} = \emptyset$
        for $q \in S_{P^*}$ do
            for $m \in S_R$ do
                for $k \in S_A$ do
                begin $n = (m + k - q)/2$
                    if ($n \in \mathbb{Z}^d$) then $S_{\bar A} = S_{\bar A} \cup n$
                end
        od od od
    end STRURAP
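Algorithm STRURAP translates almost literally into Python when offsets are stored as integer tuples. The sketch below is one possible transcription; the function name and the set-of-tuples representation are choices made here.

```python
from itertools import product


def strurap(s_pstar, s_r, s_a):
    """Structure of the Galerkin operator: all n with 2n = m + k - q (STRURAP)."""
    s_abar = set()
    for q, m, k in product(s_pstar, s_r, s_a):
        twice_n = tuple(mi + ki - qi for mi, ki, qi in zip(m, k, q))
        if all(c % 2 == 0 for c in twice_n):      # n must have integer components
            s_abar.add(tuple(c // 2 for c in twice_n))
    return s_abar


nine_point = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)}
five_point = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}
structure = strurap(nine_point, nine_point, five_point)
```

With nine-point transfer stencils (as for bilinear interpolation and its transpose) and a five-point fine grid operator, the coarse grid structure found this way is the full nine-point structure.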
Having determined $S_{\bar A}$ it is a simple matter to compute $\bar A$. This can be done with the following algorithm.

    Algorithm CALRAP
    comment Calculation of $\bar A$
    begin $\bar A = 0$
        for $n \in S_{\bar A}$ do
            for $m \in S_R$ do
                for $k \in S_A$ do
                    $q = m + k - 2n$
                    if $q \in S_{P^*}$ then
                        $G_1 = \{i \in \bar G : 2i + m \in G\} \cap \{i \in \bar G : i + n \in \bar G\}$
                        for $i \in G_1$ do
                            $\bar A(i,n) = \bar A(i,n) + R(i,m)\,A(2i+m,k)\,P^*(i+n,q)$
                        od
        od od od
    end CALRAP
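The following Python sketch carries out the same computation in one dimension, with stencils stored as functions returning dictionaries `{offset: value}`. It is an illustration under several assumptions: the name `galerkin_stencil` is invented, the coarse structure is taken to be {-1, 0, 1}, and the loop over i is placed outermost for readability, whereas CALRAP deliberately makes it the innermost loop.

```python
def galerkin_stencil(R, A, Pstar, nc, nf, offsets=(-1, 0, 1)):
    """Abar(i, n) = sum_{m,k} R(i,m) A(2i+m,k) P*(i+n, m+k-2n), cf. (5.4.9).

    R(i), A(i), Pstar(i) return dicts {offset: value}; nc and nf are the numbers
    of coarse and fine grid points (nf = 2*nc - 1 for vertex-centered grids).
    """
    abar = [dict() for _ in range(nc)]
    for i in range(nc):
        for m, r in R(i).items():
            if not 0 <= 2 * i + m < nf:          # 2i + m must lie in the fine grid
                continue
            for k, a in A(2 * i + m).items():
                for n in offsets:
                    q = m + k - 2 * n
                    if 0 <= i + n < nc and q in Pstar(i + n):
                        abar[i][n] = abar[i].get(n, 0.0) + r * a * Pstar(i + n)[q]
    return abar


h = 1.0 / 8
A = lambda i: {-1: -1 / h**2, 0: 2 / h**2, 1: -1 / h**2}     # fine grid Laplacian
R = lambda i: {-1: 0.25, 0: 0.5, 1: 0.25}                    # [1 2 1]/4
Pstar = lambda i: {-1: 0.5, 0: 1.0, 1: 0.5}                  # [1 2 1]/2
abar = galerkin_stencil(R, A, Pstar, nc=5, nf=9)
```

At an interior coarse point the result is the three-point Laplacian with mesh size 2h, so for this model problem GCA and DCA coincide in the interior. Near the boundary the result depends on how the boundary stencils are set up, which this sketch does not address.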

Keeping computation on vector and parallel machines in mind, the algorithm has been designed such that the innermost loop is the longest.

To illustrate how $G_1$ is obtained we give an example in two dimensions. Let $G$ and $\bar G$ be given by

$$G = \{i \in \mathbb{Z}^2 : 0 \leq i_1 \leq 2n_1,\ 0 \leq i_2 \leq 2n_2\}$$

$$\bar G = \{i \in \mathbb{Z}^2 : 0 \leq i_1 \leq n_1,\ 0 \leq i_2 \leq n_2\}$$

Writing j for the offset called n in CALRAP, to avoid confusion with the grid dimensions $n_\alpha$, $i \in G_1$ is equivalent to

$$\max(-j_\alpha, -m_\alpha/2, 0) \leq i_\alpha \leq \min(n_\alpha - m_\alpha/2,\ n_\alpha - j_\alpha,\ n_\alpha), \quad \alpha = 1, 2$$

It is easy to see that the inner loop vectorizes along grid lines.

Comparison of discretization and Galerkin coarse grid approximation

Although DCA seems more straightforward, GCA has some advantages. The coarsest grids employed in multigrid methods may be very coarse. On such very coarse grids DCA may be unreliable if the coefficients are variable, because these coefficients are sampled in very few points. An example where multigrid fails because of this effect is given in [137]. The situation can be remedied by not sampling the coefficients pointwise on the coarse grids, but taking suitable averages. This is, however, precisely what GCA does accurately and automatically. For the same reason GCA is to be used for interface problems (discontinuous coefficients), in which case the danger of pointwise sampling of coefficients is most obvious. Another advantage of GCA is that it is purely algebraic in nature; no use is made of the underlying differential equation. This opens the possibility of developing autonomous or 'black box' multigrid subroutines, requiring as input only a matrix and right-hand side. On the other hand, for non-linear problems and for systems of differential equations there is no general way to implement GCA. Both DCA and GCA are in widespread use.


Structure  of  coarse  grid  operator  stencil 

Galerkin coarse grid approximation will be useful only if $S_{\bar A}$ is not (much) larger than $S_A$; otherwise the important property of MG, that computing work is proportional to the number of unknowns, may get lost. For examples and further discussion of GCA, including the possible loss of the K-matrix property on coarse grids, see [141].

6  Multigrid  algorithms 

6.1  Introduction 

The  order  in  which  the  grids  are  visited  is  called  the  multigrid  schedule.  Several  schedules  will 
be  discussed.  All  multigrid  algorithms  are  variants  of  what  may  be  called  the  basic  multigrid 
algorithm.  This  basic  algorithm  is  nonlinear,  and  contains  linear  multigrid  as  a  special  case. 
The most elegant description of the basic multigrid algorithm is by means of a recursive formulation. FORTRAN does not allow recursion, thus we also present a non-recursive formulation. This can be done in many ways, and various flow diagrams have been presented in the literature. If, however, one constructs a structure diagram not many possibilities remain, and a well structured non-recursive algorithm containing only one goto statement results. The decision whether to go to a finer or to a coarser grid is taken in one place only.

6.2 The basic two-grid algorithm

Preliminaries

Let a sequence $\{G^k : k = 1, 2, \ldots, K\}$ of increasingly finer grids be given. Let $U^k$ be the set of grid functions $G^k \to \mathbb{R}$ on $G^k$; a grid function $u^k \in U^k$ stands for m functions in the case where we want to solve a set of equations for m unknowns. Let there be given transfer operators $P^k : U^{k-1} \to U^k$ (prolongation) and $R^k : U^{k+1} \to U^k$ (restriction). Let the problem to be solved on $G^k$ be denoted by

$$L^k(u^k) = b^k \tag{6.2.1}$$

The operator $L^k$ may be linear or non-linear. Let on every grid a smoothing algorithm be defined, denoted by $S(u, v, f, \nu, k)$. S changes an initial guess $u^k$ into an improved approximation $v^k$ with right-hand side $f^k$ by $\nu_k$ iterations with a suitable smoothing method. The use of the same symbol $u^k$ for the solution of (6.2.1) and for approximations of this solution will not cause confusion; the meaning of $u^k$ will be clear from the context. On the coarse grid $G^1$ we sometimes wish to solve (6.2.1) exactly; in general we do not wish to be specific about this, and we write $S(u, v, f, \cdot, 1)$ for smoothing or solving on $G^1$.

The nonlinear two-grid algorithm

Let us first assume that we have only two grids $G^k$ and $G^{k-1}$. The following algorithm is a generalization of the linear two-grid algorithm discussed in Section 2.3. Let some approximation $\tilde u^k$ of the solution on $G^k$ be given. How $\tilde u^k$ may be obtained will be discussed later. The non-linear two-grid algorithm is defined as follows. Let $f^k = b^k$.

    Subroutine TG ($\tilde u$, u, f, k)
    comment nonlinear two-grid algorithm
    begin
        S($\tilde u$, u, f, $\nu$, k)                                        (1)
        $r^k = f^k - L^k(u^k)$                                              (2)
        Choose $\tilde u^{k-1}$, $s_{k-1}$                                   (3)
        $f^{k-1} = L^{k-1}(\tilde u^{k-1}) + s_{k-1}R^{k-1}r^k$             (4)
        S($\tilde u$, u, f, $\cdot$, k - 1)                                  (5)
        $u^k = u^k + (1/s_{k-1})P^k(u^{k-1} - \tilde u^{k-1})$              (6)
        S(u, u, f, $\mu$, k)                                                (7)
    end of TG

A call of TG gives us one two-grid iteration. The following program performs ntg two-grid iterations:

    Choose $\tilde u^k$
    $f^k = b^k$
    for i = 1 step 1 until ntg do
        TG ($\tilde u$, u, f, k)
        $\tilde u = u$
    od

Discussion

Subroutine TG is a straightforward implementation of the basic multigrid principles discussed in Chapter 2, but there are a few subtleties involved.

We proceed with a discussion of subroutine TG. Statement (1) represents $\nu_k$ smoothing iterations (pre-smoothing), starting from an initial guess $\tilde u^k$. In (2) the residual $r^k$ is computed; $r^k$ is going to steer the coarse grid correction. Because 'short wavelength accuracy' already achieved in $u^k$ must not get lost, $u^k$ is to be kept, and a correction $\delta u^k$ (containing 'long wavelength information') is to be added to $u^k$. In the non-linear case, $r^k$ cannot be taken for the right-hand side of the problem for $\delta u^k$; $L^k(\delta u^k) = r^k$ might not even have a solution. For the same reason, $R^{k-1}r^k$ cannot be the right-hand side for the coarse grid problem on $G^{k-1}$. Instead, it is added in (4) to $L^{k-1}(\tilde u^{k-1})$, with $\tilde u^{k-1}$ an approximation to the solution of (6.2.1) in some sense (e.g. $P^k\tilde u^{k-1} \approx$ solution of equation (6.2.1)). Obviously, $L^{k-1}(u^{k-1}) = L^{k-1}(\tilde u^{k-1})$ has a solution, and if $R^{k-1}r^k$ is not too large, then $L^{k-1}(u^{k-1}) = L^{k-1}(\tilde u^{k-1}) + R^{k-1}r^k$ can also be solved, which is done in statement (5) (exactly or approximately).

$R^{k-1}r^k$ will be small when $u^k$ is close to the solution of equation (6.2.1), i.e. when the algorithm is close to convergence. In order to cope with situations where $R^{k-1}r^k$ is not small enough, the parameter $s_{k-1}$ is introduced. By choosing $s_{k-1}$ small enough one can bring $f^{k-1}$ arbitrarily close to $L^{k-1}(\tilde u^{k-1})$. Hence, solvability of $L^{k-1}(u^{k-1}) = f^{k-1}$ can be ensured. Furthermore, in bifurcation problems, $u^{k-1}$ can be kept on the same branch as $\tilde u^{k-1}$ by means of $s_{k-1}$. In (6) the coarse grid correction is added to $u^k$. Omission of the factor $1/s_{k-1}$ would mean that only part of the coarse grid correction is added to $u^k$, which amounts to damping of the coarse grid correction; this would slow down convergence. Finally, statement (7) represents $\mu_k$ smoothing iterations (post-smoothing).

The linear two-grid algorithm

It is instructive to see what happens when $L^k$ is linear. It is reasonable to assume that then $L^{k-1}$ is also linear. Furthermore, let us assume that the smoothing method is linear, that is to say, statement (5) is equivalent to

$$u^{k-1} = \tilde u^{k-1} + B^{k-1}(f^{k-1} - L^{k-1}\tilde u^{k-1}) \tag{6.2.2}$$

with $B^{k-1}$ some linear operator. With $f^{k-1}$ from statement (4) this gives

$$u^{k-1} = \tilde u^{k-1} + s_{k-1}B^{k-1}R^{k-1}r^k \tag{6.2.3}$$

Statement (6) gives

$$u^k = u^k + P^kB^{k-1}R^{k-1}r^k \tag{6.2.4}$$

and we see that the coarse grid correction $P^kB^{k-1}R^{k-1}r^k$ is independent of the choice of $s_{k-1}$ and $\tilde u^{k-1}$ in the linear case. Hence, we may as well choose $s_{k-1} = 1$ and $\tilde u^{k-1} = 0$ in the linear case. This gives us the following linear two-grid algorithm.

    Subroutine LTG ($\tilde u$, u, f, k)
    comment linear two-grid algorithm
    begin
        S($\tilde u$, u, f, $\nu$, k)
        $r^k = f^k - L^ku^k$
        $f^{k-1} = R^{k-1}r^k$
        $\tilde u^{k-1} = 0$
        S($\tilde u$, u, f, $\cdot$, k - 1)
        $u^k = u^k + P^ku^{k-1}$
        S(u, u, f, $\mu$, k)
    end of LTG

Choice of $\tilde u^{k-1}$ and $s_{k-1}$

There are several possibilities for the choice of $\tilde u^{k-1}$. One possibility is

$$\tilde u^{k-1} = \tilde R^{k-1}u^k \tag{6.2.5}$$

where $\tilde R^{k-1}$ is a restriction operator which may or may not be the same as $R^{k-1}$.

With the choice $s_{k-1} = 1$ this gives us the first non-linear multigrid algorithm that has appeared, the FAS (full approximation storage) algorithm proposed by Brandt [20]. The more general algorithm embodied in subroutine TG, containing the parameter $s_{k-1}$ and leaving the choice of $\tilde u^{k-1}$ open, has been proposed by Hackbusch [54], [49], [57]. In principle it is possible to keep $\tilde u^{k-1}$ fixed, provided it is sufficiently close to the solution of $L^{k-1}(u^{k-1}) = b^{k-1}$. This decreases the cost per iteration, since $L^{k-1}(\tilde u^{k-1})$ needs to be evaluated only once, but the rate of convergence may be slower than with $\tilde u^{k-1}$ defined by (6.2.5). We will not discuss this variant. Another choice of $\tilde u^{k-1}$ is provided by nested iteration, which will be discussed later.

Hackbusch [54], [49], [57] gives the following guidelines for the choice of $\tilde u^{k-1}$ and the parameter $s_{k-1}$. Let the non-linear equation $L^{k-1}(u^{k-1}) = f^{k-1}$ be solvable for $\|f^{k-1}\| < \rho_{k-1}$. Let $\|L^{k-1}(\tilde u^{k-1})\| < \rho_{k-1}/2$. Choose $s_{k-1}$ such that $\|s_{k-1}R^{k-1}r^k\| < \rho_{k-1}/2$, for example:

.  (6.2.6) 

Then $\|f^{k-1}\| < \rho_{k-1}$, so that the coarse grid problem has a solution.

6.3 The basic multigrid algorithm

The recursive non-linear multigrid algorithm

The basic multigrid algorithm follows from the two-grid algorithm by replacing the coarse grid solution statement (statement (5) in subroutine TG) by $\gamma_k$ multigrid iterations. This leads to

    Subroutine MG1 ($\tilde u$, u, f, k, $\gamma$)
    comment recursive non-linear multigrid algorithm
    begin
        if (k eq 1) then
            S($\tilde u$, u, f, $\cdot$, k)                                  (1)
        else
            S($\tilde u$, u, f, $\nu$, k)                                    (2)
            $r^k = f^k - L^k(u^k)$                                          (3)
            Choose $\tilde u^{k-1}$, $s_{k-1}$                               (4)
            $f^{k-1} = L^{k-1}(\tilde u^{k-1}) + s_{k-1}R^{k-1}r^k$         (5)
            for i = 1 step 1 until $\gamma_k$ do                             (6)
                MG1 ($\tilde u$, u, f, k - 1, $\gamma$)
            od
            $u^k = u^k + (1/s_{k-1})P^k(u^{k-1} - \tilde u^{k-1})$          (7)
            S(u, u, f, $\mu$, k)                                            (8)
        endif
    end of MG1

After our discussion of the two-grid algorithm, this algorithm is self-explanatory.

The following program carries out nmg multigrid iterations, starting on the finest grid $G^K$:

    Program 1:
    Choose $\tilde u^K$
    $f^K = b^K$
    for i = 1 step 1 until nmg do
        MG1 ($\tilde u$, u, f, K, $\gamma$)
        $\tilde u = u$
    od
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The  recursive  linear  multigrid  algorithm 

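The linear multigrid algorithm follows easily from the linear two-grid algorithm LTG:

    Subroutine LMG ($\tilde u, u, f, k$)
    comment recursive linear multigrid algorithm
    begin
        if ($k = 1$) then
            $S(\tilde u, u, f, \cdot, k)$
        else
            $S(\tilde u, u, f, \nu, k)$
            $r^k = f^k - L^k u^k$
            $f^{k-1} = R^{k-1} r^k$
            $\tilde u^{k-1} = 0$
            for $i = 1$ step 1 until $\gamma_k$ do
                LMG ($\tilde u, u, f, k-1$)
                $\tilde u^{k-1} = u^{k-1}$
            od
            $u^k = u^k + P^k u^{k-1}$
            $S(u, u, f, \mu, k)$
        endif
    end LMG

Here $\tilde u$ plays the role of an initial guess.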
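
The recursion in LMG can be tried out on a small model problem. The following is a minimal sketch, not part of the original notes, under these assumptions: the problem is $-u'' = \pi^2 \sin(\pi x)$ on $(0,1)$ with homogeneous Dirichlet conditions, level $k$ has $2^k$ intervals, the smoother is Gauss-Seidel, restriction is full weighting and prolongation is linear interpolation. All function names are illustrative.

```python
import math


def smooth(u, f, h, sweeps):
    """Gauss-Seidel sweeps for (-u[i-1] + 2u[i] - u[i+1]) / h^2 = f[i], in place."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def residual(u, f, h):
    """r = f - L u at interior points; boundary entries stay zero."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def restrict(r):
    """Full weighting from 2m intervals to m intervals."""
    m = (len(r) - 1) // 2
    rc = [0.0] * (m + 1)
    for i in range(1, m):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc


def prolong(uc):
    """Linear interpolation from m intervals to 2m intervals."""
    m = len(uc) - 1
    uf = [0.0] * (2 * m + 1)
    for i in range(m):
        uf[2 * i] = uc[i]
        uf[2 * i + 1] = 0.5 * (uc[i] + uc[i + 1])
    uf[2 * m] = uc[m]
    return uf


def lmg(u, f, k, gamma=1, nu=2, mu=2):
    """One linear multigrid iteration on level k (2**k intervals), updating u in place."""
    n = len(u) - 1
    h = 1.0 / n
    if k == 1:
        u[1] = 0.5 * h * h * f[1]        # single unknown: exact solve
        return u
    smooth(u, f, h, nu)                  # pre-smoothing
    fc = restrict(residual(u, f, h))     # coarse grid right-hand side
    uc = [0.0] * (n // 2 + 1)            # zero initial guess on the coarse grid
    for _ in range(gamma):
        lmg(uc, fc, k - 1, gamma, nu, mu)
    correction = prolong(uc)
    for i in range(1, n):
        u[i] += correction[i]            # coarse grid correction
    smooth(u, f, h, mu)                  # post-smoothing
    return u


def solve_poisson(K, cycles, gamma=1):
    """Run `cycles` multigrid iterations; return the solution and the residual history."""
    n = 2 ** K
    h = 1.0 / n
    f = [math.pi ** 2 * math.sin(math.pi * i * h) for i in range(n + 1)]
    u = [0.0] * (n + 1)
    history = [max(abs(v) for v in residual(u, f, h))]
    for _ in range(cycles):
        lmg(u, f, K, gamma)
        history.append(max(abs(v) for v in residual(u, f, h)))
    return u, history
```

Calling `solve_poisson(6, 10)` performs ten V-cycles on a grid with 64 intervals; `gamma=2` gives W-cycles. The returned history holds the maximum norm of the residual after each cycle.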

Multigrid  schedules 

The order in which the grids are visited is called the multigrid schedule or multigrid cycle.
If the parameters $\gamma_k$, $k = 1, 2, ..., K-1$ are fixed in advance we have a fixed schedule; if $\gamma_k$
depends on intermediate computational results we have an adaptive schedule. Figure 6.3.1
shows the order in which the grids are visited with $\gamma_k = 1$ and $\gamma_k = 2$, $k = 1, 2, ..., K-1$, in the
case $K = 4$. A dot represents a smoothing operation. Because of the shape of these diagrams,
these schedules are called the V-, W- and sawtooth cycles, respectively. The sawtooth cycle is
a special case of the V-cycle, in which smoothing before coarse grid correction (pre-smoothing)
is deleted. A schedule intermediate between the V- and W-cycles is the F-cycle. In this cycle coarse
grid correction takes place by means of one F-cycle followed by one V-cycle. Figure 6.3.2 gives
a diagram for the F-cycle, with $K = 5$.

Recursive  algorithm  for  V-,  F-  and  W-cycle 

Figure 6.3.1: V-, W- and sawtooth-cycle diagrams.

A version of subroutine MG1 for the V-, W- and F-cycles is as follows. The parameter $\gamma$ is
now an integer instead of an integer array.


    Subroutine MG2 ($\tilde u, u, f, k, \gamma$)
    comment nonlinear multigrid algorithm V-, W-, or F-cycle
    begin
        if ($k$ eq 1) then
            $S(\tilde u, u, f, \cdot, k)$
            if (cycle eq F) then $\gamma = 1$ endif
        else
            A
            for $i = 1$ step 1 until $\gamma$ do
                MG2 ($\tilde u, u, f, k-1, \gamma$)
            od
            B
            if ($k$ eq $K$ and cycle eq F) then $\gamma = 2$ endif
        endif
    end MG2

Here A and B represent statements (2) to (5) and (7) and (8) in subroutine MG1.

Figure 6.3.2: F-cycle diagram.

The following program carries out nmg V-, W-, or F-cycles.


Program 2:

    Choose $\tilde u^K$
    $f^K = b^K$
    if (cycle eq W or cycle eq F) then $\gamma = 2$ else $\gamma = 1$
    for $i = 1$ step 1 until nmg do
        MG2 ($\tilde u, u, f, K, \gamma$)
    od
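
The order of grid visits produced by MG2 can be checked by tracing the recursion without doing any numerical work. The sketch below is an illustration only and rests on two assumptions about the pseudocode: $\gamma$ is shared between all recursion levels, and the bound of the for loop is read once, on entry to the loop. The names `mg2_calls` and `calls_per_level` are hypothetical.

```python
from collections import Counter


def mg2_calls(K, cycle):
    """Return the levels k at which MG2 is entered during one cycle ('V', 'W' or 'F')."""
    gamma = [2 if cycle in ("W", "F") else 1]  # one-element list: shared, mutable gamma
    calls = []

    def mg2(k):
        calls.append(k)
        if k == 1:
            # coarsest grid: solve, then switch an F-cycle to V-cycle behaviour
            if cycle == "F":
                gamma[0] = 1
        else:
            # part A (pre-work) would go here
            for _ in range(gamma[0]):          # bound evaluated once, on loop entry
                mg2(k - 1)
            # part B (post-work) would go here
            if k == K and cycle == "F":
                gamma[0] = 2                   # restore gamma for the next F-cycle

    mg2(K)
    return calls


def calls_per_level(K, cycle):
    """Number of MG2 calls on levels K, K-1, ..., 1 during one cycle."""
    counts = Counter(mg2_calls(K, cycle))
    return [counts[k] for k in range(K, 0, -1)]
```

Under these assumptions, `calls_per_level(5, "F")` counts one call on the finest level and one more on each coarser level, which is the pattern $K - k + 1$ used for the F-cycle work estimate later in this section.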

Adaptive  schedule 

An example of an adaptive strategy is the following. Suppose we do not carry out a fixed
number of multigrid iterations on level $G^k$, but wish to continue to carry out multigrid
iterations until the problem on $G^k$ is solved to within a specified accuracy. Let the accuracy
requirement be

$$\|L^k(u^k) - f^k\| \le \varepsilon_k = \delta s_k \|r^{k+1}\| \qquad (6.3.1)$$

with $\delta \in (0, 1)$ a parameter.

At first sight, a more natural definition of $\varepsilon_k$ would seem to be $\varepsilon_k = \delta \|f^k\|$. Since $f^k$ does
not, however, go to zero on convergence, this would lead to skipping of coarse grid correction
when $u^{k+1}$ approaches convergence. Analysis of the linear case leads naturally to condition
(6.3.1). An adaptive multigrid schedule with criterion (6.3.1) is implemented in the following
algorithm. In order to make the algorithm finite, the maximum number of multigrid iterations
allowed is $\gamma$.

    Subroutine MG3 ($\tilde u, u, f, k$)
    comment recursive nonlinear multigrid algorithm with adaptive schedule
    begin
        if ($k$ eq 1) then
            $S(\tilde u, u, f, \cdot, k)$
        else
            A
    (1)     $t_{k-1} = \|r^k\| - \varepsilon_k$
            $\varepsilon_{k-1} = \delta s_{k-1} \|r^k\|$
            $n_{k-1} = \gamma$
            while ($t_{k-1} > 0$ and $n_{k-1} > 0$) do
                MG3 ($\tilde u, u, f, k-1$)
                $n_{k-1} = n_{k-1} - 1$
                $t_{k-1} = \|L^{k-1}(u^{k-1}) - f^{k-1}\| - \varepsilon_{k-1}$
            od
            B
        endif
    end MG3

Here A and B stand for the same groups of statements as in subroutine MG2. The purpose
of statement (1) is to allow the possibility that the required accuracy is already reached by
pre-smoothing on $G^k$, so that coarse grid correction can be skipped. The following program
solves the problem on $G^K$ within a specified tolerance, using the adaptive subroutine MG3:

Program 3:

    Choose $\tilde u^K$
    $f^K = b^K$; $\varepsilon_K = \mathrm{tol} \cdot \|b^K\|$; $t_K = \|L^K(\tilde u^K) - b^K\| - \varepsilon_K$
    $n$ = nmg
    while ($t_K > 0$ and $n > 0$) do
        MG3 ($\tilde u, u, f, K$)
        $n = n - 1$
        $t_K = \|L^K(u^K) - b^K\| - \varepsilon_K$
    od

The number of iterations is limited by nmg.
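
The stopping logic of Program 3 does not depend on what the iteration itself does. The sketch below isolates it, assuming a relative tolerance $\mathrm{tol} \cdot \|b\|$; the callables `step` and `residual_norm` are placeholders for one MG3 iteration and for $\|L^K(u^K) - b^K\|$, and the scalar example is purely illustrative.

```python
def iterate_to_tolerance(step, residual_norm, u, b_norm, tol, nmg):
    """Repeat `step` until residual_norm(u) <= tol * b_norm or nmg iterations are used.

    Returns the final iterate and the number of iterations carried out.
    """
    eps = tol * b_norm
    n = nmg
    t = residual_norm(u) - eps
    while t > 0 and n > 0:
        u = step(u)
        n -= 1
        t = residual_norm(u) - eps
    return u, nmg - n


# Toy stand-in for a multigrid iteration: solve 2u = 1 with an iteration
# that halves the residual in every step.
def toy_step(u):
    return u + 0.25 * (1.0 - 2.0 * u)


def toy_residual(u):
    return abs(1.0 - 2.0 * u)
```

With `iterate_to_tolerance(toy_step, toy_residual, 0.0, 1.0, 1e-3, 50)` the loop ends as soon as the residual drops below the tolerance; with a small `nmg` it ends when the iteration budget is exhausted, as in Program 3.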

Storage  requirements 

Let the finest grid $G^K$ be either of the vertex-centered type given by (5.1.1) or of the
cell-centered type given by (5.1.2). Let in both cases $n_\alpha = n_\alpha^{(K)} = m_\alpha \cdot 2^K$. Let the coarse
grids $G^k$, $k = K-1, K-2, ..., 1$ be constructed by successive doubling of the mesh-sizes $h_\alpha$
(standard coarsening). Hence, the number of grid-points $N_k$ of $G^k$ is

$$N_k = \prod_{\alpha=1}^{d} (1 + m_\alpha \cdot 2^k) \simeq M 2^{kd} \qquad (6.3.2)$$

in the vertex-centered case, with

$$M = \prod_{\alpha=1}^{d} m_\alpha$$

and

$$N_k = M 2^{kd} \qquad (6.3.3)$$

in the cell-centered case. In order to be able to solve efficiently on the coarsest grid $G^1$ it is
desirable that $m_\alpha$ is small. Henceforth, we will not distinguish between the vertex-centered
and cell-centered case, and assume that $N_k$ is given by (6.3.3).


It is to be expected that the amount of storage required for the computations that take
place on $G^k$ is given by $c_1 N_k$, with $c_1$ some constant independent of $k$. Then the total amount
of storage required is given by

$$c_1 \sum_{k=1}^{K} N_k \simeq \frac{2^d}{2^d - 1} c_1 N_K \qquad (6.3.4)$$

Hence, as compared to single grid solution on $G^K$ with the smoothing method selected, the use of multigrid increases
the storage required by a factor of $2^d/(2^d - 1)$, which is 4/3 in two and 8/7 in three dimensions,
so that the additional storage requirement posed by multigrid seems modest.
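
The factor in (6.3.4) is the limit of a geometric series, which is easy to verify with exact rational arithmetic. In the sketch below, which is an illustration and not part of the original notes, `c` denotes the number of coordinate directions in which the mesh is coarsened: `c = d` for standard coarsening, and `c = d - 1` for the semi-coarsening case treated next.

```python
from fractions import Fraction


def storage_ratio(c, K):
    """sum_{k=1}^{K} N_k / N_K when N_k is proportional to 2**(k*c)."""
    return sum(Fraction(1, 2 ** (c * (K - k))) for k in range(1, K + 1))


def storage_limit(c):
    """Limit of storage_ratio(c, K) for K -> infinity: 2**c / (2**c - 1)."""
    return Fraction(2 ** c, 2 ** c - 1)
```

For any finite number of levels the ratio stays below the limit, and the gap shrinks by a factor $2^c$ with every added level.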

Next, suppose that semi-coarsening (cf. Section 7.3) is used for the construction of the coarse
grids $G^k$, $k < K$. Assume that in one coordinate direction the mesh-size is the same on all
grids. Then

$$N_k = M 2^{K + k(d-1)} \qquad (6.3.5)$$

and the total amount of storage required is given by

$$c_1 \sum_{k=1}^{K} N_k \simeq \frac{2^{d-1}}{2^{d-1} - 1} c_1 N_K \qquad (6.3.6)$$

Now the total amount of storage required by multigrid compared with single grid solution on
$G^K$ increases by a factor 2 in two and 4/3 in three dimensions. Hence, in two dimensions the
storage cost associated with semi-coarsening multigrid is not negligible.

Computational work

We will estimate the computational work of one iteration with the fixed schedule algorithm
MG2. A close approximation of the computational work $w_k$ to be performed on $G^k$ will be
$w_k = c_2 N_k$, assuming the number of pre- and post-smoothings $\nu_k$ and $\mu_k$ are independent of
$k$, and that the operators $L^k$ are of similar complexity (for example, in the linear case, $L^k$ are
matrices of equal sparsity). More precisely, let us define $w_k$ to be all computing work involved
in MG2 ($\tilde u, u, f, k$), except the recursive call of MG2. Let $W_k$ be all work involved in MG2
($\tilde u, u, f, k$). Let $\gamma_k = \gamma$, $k = 2, 3, ..., K-1$, in subroutine MG2 (e.g., the V- or W-cycles).
Assume standard coarsening. Then

$$W_k = c_2 M 2^{kd} + \gamma W_{k-1} \qquad (6.3.7)$$

In [141] it is shown that if

$$\tilde\gamma \equiv \gamma / 2^d < 1 \qquad (6.3.8)$$

then

$$\tilde W_K < \tilde W \equiv 1/(1 - \tilde\gamma) \qquad (6.3.9)$$

with $\tilde W_K = W_K/(c_2 N_K)$.

The following conclusions may be drawn from (6.3.9). $\tilde W_K$ is the ratio of multigrid work and
work on the finest grid. The bulk of the work on the finest grid usually consists of smoothing.
Hence, $\tilde W_K - 1$ is a measure of the additional work required to accelerate smoothing on the
finest grid by means of multigrid.

If $\tilde\gamma > 1$ the work $W_K$ is superlinear in the number of unknowns $N_K$, see [141].

If $\tilde\gamma < 1$ equation (6.3.9) gives

$$W_K < c_2 N_K/(1 - \tilde\gamma) \qquad (6.3.10)$$

so that $W_K$ is linear in $N_K$. It is furthermore significant that the constant of proportionality
$c_2/(1 - \tilde\gamma)$ is small. This is because $c_2$ is just a little greater than the work per grid point of the
smoothing method, which is supposed to be a simple iterative method (if not, multigrid
is not applied in an appropriate way). Since an (perhaps the main) attractive feature of
multigrid is the possibility to realize linear computational complexity with a small constant of
proportionality, one chooses $\tilde\gamma < 1$, or $\gamma < 2^d$. In practice it is usually found that $\gamma > 2$ does
not result in significantly faster convergence. The rapid growth of $\tilde W$ with $\gamma$ means that it
is advantageous to choose $\gamma \le 2$, which is why the V- and W-cycles are widely used.

The computational cost of the F-cycle may be estimated as follows. In Figure 6.3.3 the
diagram of the F-cycle has been redrawn, distinguishing between the work that is done on
$G^k$ preceding coarse grid correction (pre-work, statements A in subroutine MG2) and after
coarse grid correction (post-work, statements B in subroutine MG2). The amount of pre-
and post-work together is $c_2 M 2^{kd}$, as before. It follows from the diagram that on $G^k$ the
cost of pre- and post-work is incurred $j_k$ times, with $j_k = K - k + 1$, $k = 2, 3, ..., K$, and
$j_1 = K - 1$. For convenience we redefine $j_1 = K$, bearing our earlier remarks on the inaccuracy
and unimportance of the estimate of the work in $G^1$ in mind. One obtains

$$W_K = c_2 M \sum_{k=1}^{K} (K - k + 1) 2^{kd} \qquad (6.3.11)$$

We have

$$\sum_{k=1}^{K} k 2^{kd} = \frac{2^{(K+1)d}}{(2^d - 1)^2} [K(2^d - 1) - 1] + \frac{2^d}{(2^d - 1)^2}$$

as is checked easily. It follows that

$$W_K = c_2 M \left(2^{d(K+2)} - (K+1) 2^{2d} + K 2^d\right)/(2^d - 1)^2 \qquad (6.3.12)$$

Figure 6.3.3: F-cycle (○ pre-work, • post-work).

so that

$$\tilde W_K < \tilde W = 1/(1 - 2^{-d})^2 \qquad (6.3.13)$$

Table 6.3.1 gives $\tilde W$ as given by (6.3.9) and (6.3.13) for a number of cases. The ratio of
multigrid over single grid work is seen to be not large, especially in three dimensions. The
F-cycle is not much cheaper than the W-cycle. In three dimensions the cost of the V-, F- and
W-cycles is almost the same.

    d              2        3
    V-cycle        4/3      8/7
    F-cycle        16/9     64/49
    W-cycle        2        4/3
    $\gamma = 3$   4        8/5

Table 6.3.1: Values of $\tilde W$, standard coarsening
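
The entries of Table 6.3.1 follow directly from (6.3.9) and (6.3.13), and the bound can be compared with the exact value obtained from the recursion (6.3.7). The sketch below is an illustration; it assumes $c_2 = M = 1$ and that the work on the coarsest grid is $c_2 N_1$. The function names are hypothetical.

```python
from fractions import Fraction


def gamma_cycle_bound(gamma, d):
    """Bound 1 / (1 - gamma / 2**d) of (6.3.9), standard coarsening."""
    g = Fraction(gamma, 2 ** d)
    if g >= 1:
        raise ValueError("bound (6.3.9) requires gamma / 2**d < 1")
    return 1 / (1 - g)


def f_cycle_bound(d):
    """Bound 1 / (1 - 2**-d)**2 of (6.3.13) for the F-cycle."""
    return 1 / (1 - Fraction(1, 2 ** d)) ** 2


def gamma_cycle_ratio(gamma, d, K):
    """Exact W_K / (c2 N_K) from W_k = 2**(k*d) + gamma * W_{k-1}, with W_0 = 0."""
    work = Fraction(0)
    for k in range(1, K + 1):
        work = 2 ** (k * d) + gamma * work
    return work / 2 ** (K * d)
```

For example, `gamma_cycle_ratio(2, 2, 8)` is the exact W-cycle ratio on eight levels in two dimensions; it lies just below `gamma_cycle_bound(2, 2)`.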

Suppose next that semi-coarsening is used. Assume that in one coordinate direction the
mesh-size is the same on all grids. The number of grid-points $N_k$ of $G^k$ is given by (6.3.5).
With $\gamma_k = \gamma$, $k = 2, 3, ..., K-1$ we obtain

$$W_k = c_2 M 2^{K + k(d-1)} + \gamma W_{k-1} \qquad (6.3.14)$$

Hence (6.3.8) and (6.3.9) hold with $\tilde\gamma = \gamma / 2^{d-1}$. For the F-cycle we obtain

$$W_K = c_2 M 2^K \sum_{k=1}^{K} (K - k + 1) 2^{k(d-1)} \qquad (6.3.15)$$

Hence

$$\tilde W_K < \tilde W = 1/(1 - 2^{1-d})^2$$

    d              2        3
    V-cycle        2        4/3
    F-cycle        4        16/9
    W-cycle        -        2
    $\gamma = 3$   -        4

Table 6.3.2: Values of $\tilde W$, semi-coarsening

Table 6.3.2 gives $\tilde W$ for a number of cases. In two dimensions $\gamma = 2$ or 3 is not useful, because
$\tilde\gamma \ge 1$. It may happen that the rate of convergence of the V-cycle is not independent of the
mesh-size, for example if a singular perturbation problem is being solved (e.g. convection-
diffusion problem with $\varepsilon \ll 1$), or when the solution contains singularities. With the W-cycle
we have $\tilde\gamma = 1$ with semi-coarsening, hence $\tilde W_K = K$. In practice, $K$ is usually not greater
than 6 or 7, so that the W-cycle is still affordable. The F-cycle may be more efficient.
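
The statement that the W-cycle costs exactly $K$ times the fine grid work with semi-coarsening in two dimensions can be checked from the recursion (6.3.14). The following sketch is illustrative; it sets $c_2 = M = 1$ and takes the work on the coarsest grid equal to $c_2 N_1$, and the function names are hypothetical.

```python
from fractions import Fraction


def semi_coarsening_work_ratio(gamma, d, K):
    """W_K / (c2 N_K) from W_k = 2**(K + k*(d-1)) + gamma * W_{k-1}, with W_0 = 0."""
    work = Fraction(0)
    for k in range(1, K + 1):
        work = 2 ** (K + k * (d - 1)) + gamma * work
    return work / 2 ** (K * d)


def semi_coarsening_f_cycle_ratio(d, K):
    """W_K / (c2 N_K) for the F-cycle, from the sum (6.3.15)."""
    total = sum((K - k + 1) * 2 ** (k * (d - 1)) for k in range(1, K + 1))
    return Fraction(2 ** K * total, 2 ** (K * d))
```

Here `semi_coarsening_work_ratio(2, 2, K)` grows linearly with the number of levels, whereas the F-cycle ratio in two dimensions stays below the bound 4 listed in Table 6.3.2.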

Work  units 

The ideal computing method to approximate the behaviour of a given physical problem
involves an amount of computing work that is proportional to the number and size of the
physical changes that are modeled. This has been put forward as the 'golden rule of
computation' by Brandt [16]. As has been emphasized by Brandt in a number of publications, e.g.
[20], [21], [22], [16], this involves not only the choice of methods to solve (6.2.1), but also the
choice of the mathematical model and its discretization. The discretization and solution
processes should be intertwined, leading to adaptive discretization. We shall not discuss adaptive
methods here, but regard (6.2.1) as given. A practical measure of the minimum computing
work to solve (6.2.1) is as follows. Let us define one work unit (WU) as the amount of
computing work required to evaluate the residual $L^K(u^K) - b^K$ of Equation (6.2.1) on the finest grid
$G^K$. Then it is to be expected that (6.2.1) cannot be solved at a cost less than a few WU, and
one should be content if this is realized. Many publications show that this goal can indeed
be achieved with multigrid for significant physical problems, for example in computational
fluid dynamics. In practice the work involved in smoothing is by far the dominant part of the
total work. One may, therefore, also define one work unit, following [20], as the work involved
in one smoothing iteration on the finest grid $G^K$. This agrees more or less with the first
definition only if the smoothing algorithm is simple and cheap. As was already mentioned, if
this is not the case multigrid is not applied in an appropriate way. One smoothing iteration
on $G^k$ then adds $2^{d(k-K)}$ WU to the total work. It is a good habit, followed by many authors,
to publish convergence histories in terms of work units. This facilitates comparisons between
methods, and helps in developing and improving multigrid codes.

6.4 Nested iteration
The  algorithm 

Nested iteration, also called full multigrid (FMG, [22], [16]), is based on the following idea.
When no a priori information about the solution is available to assist in the choice of the
initial guess $\tilde u^K$ on the finest grid $G^K$, it is obviously wasteful to start the computation on
the finest grid, as is done by subroutines MGi, i = 1, 2, 3 of the preceding section. With
an unfortunate choice of the initial $\tilde u^K$ the algorithm might even diverge for a nonlinear
problem. Computing on the coarse grids is so much cheaper that it is better to use the
coarse grids to provide an informed guess for $\tilde u^K$. At the same time, this gives us a choice
for $\tilde u^k$, $k < K$. Nested iteration is defined by the following algorithm.

Program 1

    comment nested iteration algorithm
    Choose $\tilde u^1$
    (1) $S(\tilde u, u, f, \cdot, 1)$
    for $k = 2$ step 1 until $K$ do
    (2)     $\tilde u^k = u^k = \tilde P^k u^{k-1}$
            for $i = 1$ step 1 until $\hat\gamma_k$ do
    (3)         MG ($\tilde u, u, f, k$)
            od
    od

Of course, the value of $\gamma_k$ inside MG may be different from $\hat\gamma_k$.

Choice  of  prolongation  operator 

The prolongation operator $\tilde P^k$ does not need to be identical to $P^k$. In fact, there may be
good reason to choose it differently. As discussed in [57] (for a simplified analysis see [141]),
it is often advisable to choose $\tilde P^k$ such that

$$m_{\tilde P} > m_c \qquad (6.4.1)$$

where $m_{\tilde P}$ is the order of the prolongation operator as defined in Section 5.3, and $m_c$ is the
order of consistency of the discretizations $L^k$, here assumed to be the same on all grids.
Often $m_c = 2$ (second-order schemes). Then (6.4.1) implies that $\tilde P^k$ is exact for second-order
polynomials.

Note that nested iteration provides $\tilde u^k$; this is an alternative to (6.2.5).

As discussed in [57] and [141], if MG converges well then the nested iteration algorithm
results in a $u^K$ which differs from the solution of (6.2.1) by an amount of the order of the
truncation error. If one desires, the accuracy of $u^K$ may be improved further by following
the nested iteration algorithm with a few more multigrid iterations.

Computational  cost  of  nested  iteration 

Let $\hat\gamma_k = \hat\gamma$, $k = 2, 3, ..., K$, in the nested iteration algorithm, let $W_k$ be the work involved in
MG ($\tilde u, u, f, k$), and assume for simplicity that the (negligible) work on $G^1$ equals $W_1$. Then
the computational work $W_{ni}$ of the nested iteration algorithm, neglecting the cost of $\tilde P^k$, is
given by

$$W_{ni} = \hat\gamma \sum_{k=1}^{K} W_k \qquad (6.4.2)$$

Assume inside MG $\gamma_k = \gamma$, $k = 2, 3, ..., K$ and let $\tilde\gamma = \gamma/2^d < 1$. Note that $\gamma$ and $\hat\gamma$ may be
different. Then it follows from (6.3.9) that

$$W_{ni} < \frac{\hat\gamma c_2}{1 - \tilde\gamma} \sum_{k=1}^{K} N_k \simeq \frac{\hat\gamma c_2}{(1 - \tilde\gamma)(1 - 2^{-d})} N_K \qquad (6.4.3)$$

Defining a work unit as 1 WU = $c_2 N_K$, i.e. approximately the work of $(\nu + \mu)$ smoothing
iterations on the finest grid, the cost of a nested iteration is

$$W_{ni} \simeq \hat\gamma / [(1 - \tilde\gamma)(1 - 2^{-d})] \ \mathrm{WU} \qquad (6.4.4)$$

Table 6.4.1 gives the number of work units required for nested iteration for a number of cases.
The cost of nested iteration is seen to be just a few work units. Hence the fundamental
property, which makes multigrid methods so attractive: multigrid methods can solve many
problems to within truncation error at a cost of $cN$ arithmetic operations. Here $N$ is the
number of unknowns, and $c$ is a constant which depends on the problem and on the multigrid
method (choice of smoothing method and of the parameters $\nu_k$, $\mu_k$, $\gamma_k$). If the cost of the
residual $b^K - L^K(u^K)$ is $bN$, then $c$ need not be larger than a small multiple of $b$. Other
numerical methods for elliptic equations require $O(N^\alpha)$ operations with $\alpha > 1$, achieving
$O(N \ln N)$ only in special cases (e.g. separable equations). A class of methods which is
competitive with multigrid for linear problems in practice are preconditioned conjugate gradient
methods. Practice and theory (for special cases) indicate that these require $O(N^\alpha)$
operations, with $\alpha = 5/4$ in two and $\alpha = 9/8$ in three dimensions. Comparisons will be given later.

                   d
    $\gamma$       2        3
    1              16/9     64/49
    2              8/3      48/21

Table 6.4.1: Computational cost of nested iteration in work units; $\hat\gamma = 1$
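
The entries of Table 6.4.1 can be recomputed from (6.4.4). The sketch below is an illustration with a hypothetical function name; it evaluates the formula exactly for standard coarsening. Three of the four entries are reproduced; for $\gamma = 2$, $d = 3$ the formula as stated evaluates to 32/21, whereas the table prints 48/21.

```python
from fractions import Fraction


def nested_iteration_cost(gamma, d, gamma_hat=1):
    """Cost of nested iteration in work units, formula (6.4.4).

    gamma     -- number of coarse grid iterations inside MG
    gamma_hat -- number of MG iterations per level in the nested iteration
    """
    g = Fraction(gamma, 2 ** d)
    if g >= 1:
        raise ValueError("formula (6.4.4) requires gamma / 2**d < 1")
    return Fraction(gamma_hat) / ((1 - g) * (1 - Fraction(1, 2 ** d)))
```

The cost scales linearly with `gamma_hat`, so two multigrid iterations per level double every entry.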


6.5  Non-recursive  formulation  of  the  basic  multigrid  algorithm 
Structure  diagram  for  fixed  multigrid  schedule 

In FORTRAN, recursion is not allowed: a subroutine cannot call itself. The subroutines MG1,
2, 3 of Section 6.3 cannot, therefore, be implemented directly in FORTRAN. A non-recursive
version will, therefore, be presented. At the same time, we will allow greater flexibility in the
decision whether to go to a finer or to a coarser grid.

Various flow diagrams describing non-recursive multigrid algorithms have been published,
for example in [20] and [57]. In order to arrive at a well structured program, we begin by
presenting a structure diagram. A structure diagram allows much less freedom in the design
of the control structure of an algorithm than a flow diagram. We found basically only one
way to represent the multigrid algorithm in a structure diagram ([134], [140]). This structure
diagram might, therefore, be called the canonical form of the basic multigrid algorithm. The
structure diagram is given in Figure 6.5.1. This diagram is equivalent to Program 2 calling
MG2 to do nmg multigrid iterations with finest grid $G^K$ in Section 6.3. The schedule is fixed
and includes the V-, W- and F-cycles. Parts A and B are specified after subroutine MG2 in
Section 6.3. Care has been taken that the program also works as a single grid method for
$K = 1$.

FORTRAN  implementation  of  while  clause 

Apart  from  the  while  clause,  the  structure  diagram  of  Figure  6.5.1  can  be  expressed  directly 
in  FORTRAN.  A  FORTRAN  implementation  of  a  while  clause  is  as  follows.  Suppose  we 
have  the  following  program 

    while (n(K) > 0) do
        Statement 1
        n(K) = ...
        Statement 2
    od

A FORTRAN version of this program is

    10 if (n(K) > 0) then
          Statement 1
          n(K) = ...
          Statement 2
          goto 10
       endif

    Choose $\tilde u^K$ and $\gamma$
    comment $\gamma = 1$: V-cycle; $\gamma = 2$: W-cycle
    $k = K$; $n_K$ = nmg
    if (cycle eq F) then $\gamma = 2$ endif
    while ($n_K > 0$) do
        ...

Figure 6.5.1: Structure diagram of non-recursive multigrid algorithm with fixed schedule,
including V-, W- and F-cycles.

The goto statement required for the FORTRAN version of the while clause is the only goto
needed in the FORTRAN implementation of the structure diagram of Figure 6.5.1. This
FORTRAN implementation is quite obvious, and will not be given.
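
One possible way to remove the recursion is to keep, for every level $k$, a counter $n_k$ of the multigrid iterations that remain to be done on that level. The sketch below is an illustration of this idea for a fixed $\gamma$ (V- and W-cycles); it is not a transcription of Figure 6.5.1 and does not cover the F-cycle. It records the order of operations only, and compares it with the recursive formulation. All names are hypothetical.

```python
def recursive_schedule(K, gamma):
    """Operations of one multigrid cycle in recursive form (fixed gamma)."""
    ops = []

    def mg(k):
        if k == 1:
            ops.append(("solve", 1))
        else:
            ops.append(("pre", k))       # part A
            for _ in range(gamma):
                mg(k - 1)
            ops.append(("post", k))      # part B

    mg(K)
    return ops


def nonrecursive_schedule(K, gamma, nmg=1):
    """The same schedule without recursion, for nmg cycles on the finest grid K."""
    ops = []
    n = [0] * (K + 1)                    # n[k]: iterations still to be done on level k
    k = K
    n[K] = nmg
    while n[K] > 0:
        while k > 1:                     # descend: pre-work, then start gamma iterations below
            ops.append(("pre", k))
            n[k - 1] = gamma
            k -= 1
        ops.append(("solve", 1))         # coarsest grid
        n[1] -= 1
        while k < K and n[k] == 0:       # ascend as long as the current level is finished
            k += 1
            ops.append(("post", k))
            n[k] -= 1
    return ops
```

For $K = 1$ the loop reduces to `nmg` solves on the single grid, so the sketch also works as a single grid method.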

Testing  of  multigrid  software 

A simple way to test whether a multigrid algorithm is functioning properly is to measure the
residual before and after each smoothing operation, and before and after each visit to coarser
grids. If a significant reduction of the size of the residual is not found, then the relevant
part of the algorithm (smoothing or coarse grid correction) is not functioning properly. For
simple test problems predictions by Fourier smoothing analysis and the contraction number
of the multigrid method should be correlated. If the coarse grid problem is solved exactly (a
situation usually approximately realized with the W-cycle) the multigrid contraction number
should usually be approximately equal to the smoothing factor.

Local  smoothing 

It may, however, happen that for a well designed multigrid algorithm the contraction
number is significantly worse than predicted by the smoothing factor. This may be caused
by the fact that Fourier smoothing analysis is locally not applicable. The cause may be a
local singularity in the solution. This occurs for example when the physical domain has a
reentrant corner. The coordinate mapping from the physical domain onto the computational
rectangle is singular at that point. It may well be that the smoothing method does not
reduce the residual sufficiently in the neighbourhood of this singularity, a fact that does not
remain undetected if the testing procedures recommended above are applied. The remedy is
to apply additional local smoothing in a small number of points in the neighbourhood of the
singularity. This procedure is recommended in [16], [17], [18], [9], and justified theoretically
in [110] and [24]. This local smoothing is applied only to a small number of points, thus the
computing work involved is negligible.

6.6  Remarks  on  software 

Multigrid  software  development  can  be  approached  in  various  ways,  two  of  which  will  be 
examined  here. 

The first approach is to develop general building blocks and diagnostic tools, which help
users to develop their own software for particular applications without having to start from
scratch. Users will, therefore, need a basic knowledge of multigrid methods. Such software
tools are described in [26].

The second approach is to develop autonomous (black box) programs, for which the user
has to specify only the problem on the finest grid. A program or subroutine may be called
autonomous if it does not require any additional input from the user apart from problem
specification, consisting of the linear discrete system of equations to be solved and the right-hand
side. The user does not need to know anything about multigrid methods. The subroutine is
perceived by the user as if it were just another linear algebra solution method. This approach
is adopted by the MGD codes [133], [67], [69], [68], [107], [108], which are available in the
NAG library, and by the MGCS code [36].

Of course, it is possible to steer a middle course between the two approaches just outlined,
allowing or requiring the user to specify details about the multigrid method to be used, such
as offering a selection of smoothing methods, for example. Programs developed in this vein
are BOXMG [37], [38], [39], the MG00 series of codes [45], [44], [113] which is available
in ELLPACK [100], MUDPACK [3], [2], and the PLTMG code [11], [10], [12]. Except for
PLTMG and MGD, the user specifies the linear differential equation to be solved and the
program generates a finite difference discretization. PLTMG generates adaptive finite
element discretizations of non-linear equations, and therefore has a much wider scope than the
other packages. As a consequence, it is not (meant to be) a solver as fast as the other methods.

By sacrificing generality for efficiency very fast multigrid methods can be obtained for
special problems, such as the Poisson or the Helmholtz equation. In MG00 this can be done
by setting certain parameters. A very fast multigrid code for the Poisson equation has been
developed in [13]. This is probably the fastest two-dimensional Poisson solver in existence.

If one wants to emulate a linear algebraic systems solver, with only the fine grid matrix and
right-hand side supplied by the user, then the use of coarse grid Galerkin approximation
(Chapter 5) is mandatory. Coarse grid Galerkin approximation is also required if the coefficients
in the differential equations are discontinuous. Coarse grid Galerkin approximation is used
in MGD, MGCS and BOXMG; the last two codes use operator-dependent transfer operators
and are applicable to problems with discontinuous coefficients.

In an autonomous subroutine the method cannot be adapted to the problem, so that user
expertise is not required. The method must, therefore, be very robust. If one of the smoothers
that were found to be robust in Chapter 4 is used, the required degree of robustness is indeed
obtained for linear problems.

Non-linear problems may be solved with multigrid codes for linear problems in various
ways. The problem may be linearized and solved iteratively, for example by the Newton method.
This works well as long as the Jacobian of the non-linear discrete problem is non-singular.
It may well happen, however, that the given continuous problem has no Fréchet derivative.
In this case the condition of the Jacobian deteriorates as the grid is refined, and the Newton
method does not converge rapidly or not at all. An example of this situation is given in [96],
[95]. The non-linear multigrid method can be used safely and efficiently, because the global
system is not linearized. A systematic way of applying numerical software outside the class
of problems to which the software is directly applicable is the defect correction approach. In
[5] and [15] it is pointed out how this ties in with multigrid methods.

Much  software  is  made  available  on  MGNet. 

6.7  Comparison  with  conjugate  gradient  methods 

Although the scope and applicability of multigrid principles are much broader, multigrid
methods can be regarded as very efficient ways to solve linear systems arising from
discretization of partial differential equations. As such multigrid can be viewed as a technique to
accelerate the convergence of basic iterative methods (called smoothers in the multigrid
context). Another powerful technique to accelerate basic iterative methods for linear problems
that also has come to fruition relatively recently is provided by conjugate gradient and
related methods. For an introduction to conjugate gradient acceleration of iterative methods,
see [63], [48] or (for a very brief synopsis) [141].

It is surprising that, although the algorithm is much simpler, the rate of convergence of
conjugate gradient methods is harder to estimate theoretically than for multigrid methods. In
two dimensions, $O(N^{5/4})$ computational complexity, and probably $O(N^{9/8})$ in three
dimensions, seems to hold approximately quite generally for conjugate gradient methods
preconditioned by approximate factorization, which comes close to the $O(N)$ of multigrid methods.

Conjugate  gradient  acceleration  of  multigrid 

The conjugate gradient method can be used to accelerate any iterative method, including
multigrid methods. If the multigrid algorithm is well designed and fits the problem it will
converge fast, making conjugate gradient acceleration superfluous or even wasteful. If
multigrid does not converge fast one may try to remedy this by improving the algorithm (for
example, introducing additional local smoothing near singularities, or adapting the smoother
to the problem), but if this is impossible because an autonomous (black box) multigrid code is
used, or difficult because one cannot identify the cause of the trouble, then conjugate gradient
acceleration is an easy and often very efficient way out.

The  non-symmetric  case 

A severe limitation of conjugate gradient methods is their restriction to linear systems with
symmetric positive definite matrices. A number of conjugate gradient type methods have
been proposed that are applicable to the non-symmetric case. Although no theoretical
estimates are available, their rate of convergence is often satisfactory in practice. Two such
methods are CGS (conjugate gradient squared), described in [107], [108], [105] and [141], and
GMRES, described in [102], [121], [120], [131], [132]. Good convergence is expected if the
eigenvalues of $A$ have positive real part, cf. the remarks on convergence in [105].

Comparison of conjugate gradient and multigrid methods

Realistic  estimates  of  the  performance  in  practice  of  conjugate  gradient  and  multigrid  methods 
by  purely  theoretical  means  are  possible  only  for  very  simple  problems.  Therefore  numerical 
experiments  are  necessary  to  obtain  insight  and  confidence  in  the  efficiency  and  robustness 
of  a  particular  method.  Numerical  experiments  can  be  used  only  to  rule  out  methods  that 
fail,  not  to  guarantee  good  performance  of  a  method  for  problems  that  have  not  yet  been 
attempted. Nevertheless, one strives to build up confidence by carefully choosing test problems,
trying to make them representative of large classes of problems, taking into account
the  nature  of  the  mathematical  models  that  occur  in  the  field  of  application  that  one  has  in 
mind.  For  the  development  of  conjugate  gradient  and  multigrid  methods,  in  particular  the 
subject  areas  of  computational  fluid  dynamics,  petroleum  reservoir  engineering  and  neutron 
diffusion  are  pace-setting. 

Important constant coefficient test problems are (4.5.3) and (4.5.4). Problems with constant
coefficients are thought to be representative of problems with smoothly varying coefficients.
Of course, in the code to be tested the fact that the coefficients are constant should
not be exploited. As pointed out in [35], one should keep in mind that for constant coefficient
problems the spectrum of the matrix resulting from discretization can have very special properties
that are not present when the coefficients are variable. Therefore one should also carry
out tests with variable coefficients, especially with conjugate gradient methods, for which the
properties of the spectrum are very important. For multigrid methods, constant coefficient
test problems are often more demanding than variable coefficient problems, because it may
happen that the smoothing process is not effective for certain combinations of $\varepsilon$ and $\beta$. This
fact easily goes unnoticed with variable coefficients, where the unfavourable values of $\varepsilon$ and
$\beta$ perhaps occur only in a small part of the domain.

In  petroleum  reservoir  engineering  and  neutron  diffusion  problems  quite  often  equations 
with  strongly  discontinuous  coefficients  appear.  For  these  problems  equations  (4.5.3)  and 
(4.5.4)  are  not  representative.  Suitable  test  problems  with  strongly  discontinuous  coefficients 
have  been  proposed  in  [111]  and  [79];  a  definition  of  these  test  problems  may  also  be  found 
in [80]. In Kershaw's problem the domain is non-rectangular, but is a rectangular polygon.
The matrix for both problems is symmetric positive definite. With vertex-centered multigrid,
operator-dependent transfer operators have to be used, of course.

The  four  test  problems  just  mentioned,  i.e.  (4.5.3),  (4.5.4)  and  the  problems  of  Stone 
and  Kershaw,  are  gaining  acceptance  among  conjugate  gradient  and  multigrid  practitioners 
as  standard  test  problems.  Given  these  test  problems,  the  dilemma  of  robustness  versus 
efficiency presents itself. Should one try to devise a single code to handle all problems (robustness),
or develop codes that handle only a subset, but do so more efficiently than a robust
code? This dilemma is not novel, and just as in other parts of numerical mathematics, we
expect that both approaches will be fruitful, and no single 'best' code will emerge.

Numerical  experiments  for  the  test  problems  of  Stone  and  Kershaw  and  equations  (4.5.3) 
and (4.5.4), comparing CGS and multigrid, are described in [107], using ILU and IBLU preconditioning
and smoothing. As expected, the rate of convergence of multigrid is unaffected
when the mesh size is decreased, whereas CGS slows down. On a 65 x 65 grid there is no great
difference in efficiency. Another comparison of conjugate gradients and multigrid is presented
in [40]. Robustness and efficiency of conjugate gradient and multigrid methods are determined
to a large extent by the preconditioning and the smoothing method respectively. The
smoothing  methods  that  were  found  to  be  robust  on  the  basis  of  Fourier  smoothing  analysis  in 
Chapter  4  suffice,  also  as  preconditioners.  It  may  be  concluded  that  for  medium-sized  linear 
problems  conjugate  gradient  methods  are  about  equally  efficient  as  multigrid  in  accelerating 
basic  iterative  methods.  As  such  they  are  limited  to  linear  problems,  unlike  multigrid.  On 
the  other  hand,  conjugate  gradient  methods  are  much  easier  to  program,  especially  when  the 
computational  grid  is  non-rectangular. 


7  Finite  volume  discretization 

7.1  Introduction 

In  this  chapter  some  essentials  of  finite  volume  discretization  of  partial  differential  equations 
are  summarised.  For  a  more  complete  elementary  introduction,  see  for  example  [46]  or  [89]. 
We  will  pay  special  attention  to  the  handling  of  discontinuous  coefficients,  because  there  seem 
to  be  no  texts  giving  a  comprehensive  account  of  discretization  methods  for  this  situation. 
Discontinuous  coefficients  arise  in  important  application  areas,  such  as  porous  media  flows 
(reservoir  engineering),  and  require  special  treatment  in  the  multigrid  context.  Furthermore, 
hyperbolic systems will be briefly discussed.

7.2  An  elliptic  equation 

Cartesian tensor notation is used with conventional summation over repeated Greek subscripts
(not over Latin subscripts). Greek subscripts stand for dimension indices and have range 1, 2,
..., d with d the number of space dimensions. The subscript $,\alpha$ denotes the partial derivative
with respect to $x_\alpha$.

The general single second-order elliptic equation can be written as

$Lu \equiv -(a_{\alpha\beta} u_{,\alpha})_{,\beta} + (b_\alpha u)_{,\alpha} + cu = s \quad \text{in } \Omega \subset \mathbb{R}^d$  (7.2.1)




The diffusion tensor is assumed to be symmetric: $a_{\alpha\beta} = a_{\beta\alpha}$. The boundary conditions
will be discussed later. Uniform ellipticity is assumed: there exists a constant C > 0 such
that

$a_{\alpha\beta} v_\alpha v_\beta \ge C v_\alpha v_\alpha, \quad \forall v \in \mathbb{R}^d$  (7.2.2)

For  d  =  2  this  is  equivalent  to  Equation  (7.2.9). 

The domain $\Omega$

The domain $\Omega$ is taken to be the d-dimensional unit cube. This greatly simplifies the construction
of the various grids and the transfer operators between them, used in multigrid.
In practice, multigrid for finite difference and finite volume discretization can in principle be
applied to more general domains, but the description of the method becomes complicated,
and general domains will not be discussed here. This is not a serious limitation, because the
current main trend in grid generation consists of decomposition of the physical domain in
subdomains, each of which is mapped onto a cubic computational domain. In general, such
mappings change the coefficients in (7.2.1). As a result, special properties, such as separability
or the coefficients being constant, may be lost, but this does not seriously hamper the
application  of  multigrid,  because  this  approach  is  applicable  to  (7.2.1)  in  its  general  form. 
This  is  one  of  the  strengths  of  multigrid  as  compared  with  older  methods. 

The  weak  formulation 

Assume that $a_{\alpha\beta}$ is discontinuous along some manifold $\Gamma \subset \Omega$, which we will call an interface;
then Equation (7.2.1) has to be interpreted in the weak sense, as follows. From (7.2.1)
it follows that

$(Lu, v) = (s, v), \quad \forall v \in H, \qquad (u, v) \equiv \int_\Omega uv \, d\Omega$  (7.2.3)

where H is a suitable Sobolev space. Define

$a(u, v) \equiv \int_\Omega a_{\alpha\beta} u_{,\alpha} v_{,\beta} \, d\Omega - \int_{\partial\Omega} a_{\alpha\beta} u_{,\alpha} n_\beta v \, d\Gamma$
$b(u, v) \equiv \int_\Omega (b_\alpha u)_{,\alpha} v \, d\Omega$  (7.2.4)

with $n_\beta$ the $x_\beta$ component of the outward unit normal on the boundary $\partial\Omega$ of $\Omega$. Application
of the Gauss divergence theorem gives

$(Lu, v) = a(u, v) + b(u, v) + (cu, v)$  (7.2.5)

The weak formulation of (7.2.1) is

Find $u \in H$ such that $a(u, v) + b(u, v) + (cu, v) = (s, v), \quad \forall v \in H$  (7.2.6)

For suitable choices of the function spaces and boundary conditions, existence and uniqueness of the solution
of  (7.2.6)  has  been  established.  For  more  details  on  the  weak  formulation  (not  needed  here), 
see  for  example  [31]  or  [58]. 

The  jump  condition 

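Consider the case with one interface $\Gamma$, which divides $\Omega$ in two parts $\Omega_1$ and $\Omega_2$, in each of
which $a_{\alpha\beta}$ is continuous. At $\Gamma$, $a_{\alpha\beta}(x)$ is discontinuous. Let indices 1 and 2 denote quantities
on $\Gamma$ at the side of $\Omega_1$ and $\Omega_2$, respectively, and let $n^1$ be the unit normal on $\Gamma$ pointing out of $\Omega_1$.
Application of the Gauss divergence theorem to
(7.2.5) gives, if u is smooth enough in $\Omega_1$ and $\Omega_2$,

$a(u, v) = -\int_{\Omega \setminus \Gamma} (a_{\alpha\beta} u_{,\alpha})_{,\beta} v \, d\Omega + \int_\Gamma (a^1_{\alpha\beta} u^1_{,\alpha} - a^2_{\alpha\beta} u^2_{,\alpha}) n^1_\beta v \, d\Gamma$  (7.2.7)

Hence, the solution of (7.2.6), if it is smooth enough in $\Omega_1$ and $\Omega_2$, satisfies (7.2.1) in $\Omega \setminus \Gamma$,
together with the following jump condition on the interface $\Gamma$

$a^1_{\alpha\beta} u^1_{,\alpha} n^1_\beta = a^2_{\alpha\beta} u^2_{,\alpha} n^1_\beta$ on $\Gamma$  (7.2.8)

This means that where $a_{\alpha\beta}$ is discontinuous, so is $u_{,\alpha}$. This has to be taken into account in
constructing discrete approximations.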

Exercise 3.2.1. Show that in two dimensions Equation (7.2.2) is equivalent to

$a_{11} a_{22} - a_{12}^2 > 0$  (7.2.9)

7.3  A  one-dimensional  example 

The basic ideas of finite difference and finite volume discretization taking discontinuities in
$a_{\alpha\beta}$ into account will be explained for the following example

$-(a u_{,1})_{,1} = s, \quad x \in \Omega \equiv (0, 1)$  (7.3.1)

Boundary  conditions  will  be  given  later. 

Finite  difference  discretization 

A computational grid $G \subset \bar{\Omega}$ is defined by

$G = \{x \in \mathbb{R} : x = x_j = jh, \ j = 0, 1, 2, ..., n, \ h = 1/n\}$  (7.3.2)

Forward and backward difference operators are defined by

$\Delta u_j \equiv (u_{j+1} - u_j)/h, \qquad \nabla u_j \equiv (u_j - u_{j-1})/h$  (7.3.3)

A finite difference approximation of (7.3.1) is obtained by replacing d/dx by $\Delta$ or $\nabla$. A nice
symmetric formula is

$-\frac{1}{2}\{\nabla(a\Delta) + \Delta(a\nabla)\}u_j = s_j, \quad j = 1, 2, ..., n-1$  (7.3.4)

where $s_j = s(x_j)$ and $u_j$ is the numerical approximation of $u(x_j)$. Written out in full,
Equation (7.3.4) gives

$\{-(a_{j-1} + a_j)u_{j-1} + (a_{j-1} + 2a_j + a_{j+1})u_j - (a_j + a_{j+1})u_{j+1}\}/2h^2 = s_j, \quad j = 1, 2, ..., n-1$  (7.3.5)


If the boundary condition at x = 0 is u(0) = f (Dirichlet), we eliminate $u_0$ from (7.3.5) with
$u_0 = f$. If the boundary condition is $a(0)u_{,1}(0) = f$ (Neumann), we write down (7.3.5)
for j = 0 and replace the quantity $-(a_{-1} + a_0)u_{-1} + (a_{-1} + a_0)u_0$ by 2f. If the boundary
condition is $c_1 u_{,1}(0) + c_2 u(0) = f$ (Robin), we again write down (7.3.5) for j = 0, and replace
the quantity just mentioned by $2(f - c_2 u_0)a(0)/c_1$. The boundary condition at x = 1 is
handled  in  a  similar  way. 
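
As an illustration of (7.3.5) with Dirichlet conditions at both ends, the following Python sketch assembles the tridiagonal system and solves it by Gaussian elimination without pivoting (the Thomas algorithm). The function names `solve_tridiagonal` and `fd_solve` are hypothetical, chosen only for this example, and sampling a(x) at the grid points to obtain the values a_j is an assumption.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    n = len(diag)
    d = list(diag)
    r = list(rhs)
    for i in range(1, n):  # forward elimination
        m = lower[i] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def fd_solve(a, s, n, u_left, u_right):
    """Solve -(a u')' = s on (0, 1) with scheme (7.3.5), u(0) = u_left, u(1) = u_right."""
    h = 1.0 / n
    x = [j * h for j in range(n + 1)]
    av = [a(xj) for xj in x]  # assumption: a_j = a(x_j)
    lower, diag, upper, rhs = [], [], [], []
    for j in range(1, n):
        west = (av[j - 1] + av[j]) / (2.0 * h * h)
        east = (av[j] + av[j + 1]) / (2.0 * h * h)
        lower.append(-west)
        diag.append(west + east)
        upper.append(-east)
        rhs.append(s(x[j]))
    # Dirichlet conditions: move the known boundary values to the right-hand side
    rhs[0] += -lower[0] * u_left
    rhs[-1] += -upper[-1] * u_right
    interior = solve_tridiagonal(lower, diag, upper, rhs)
    return x, [u_left] + interior + [u_right]
```

For a = 1 and s = 2 with homogeneous Dirichlet data the scheme reproduces u = x(1 - x) at the grid points up to rounding, because the second difference is exact for quadratic functions.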

An  interface  problem 

In order to show that (7.3.4) can be inaccurate for interface problems, we consider the following
example

$a(x) = \varepsilon, \ 0 < x < x^*, \qquad a(x) = 1, \ x^* < x < 1$  (7.3.6)

The boundary conditions are: u(0) = 0, u(1) = 1. The jump condition (7.2.8) becomes

$\varepsilon \lim_{x \uparrow x^*} u_{,1} = \lim_{x \downarrow x^*} u_{,1}$  (7.3.7)

By postulating a piecewise linear solution the solution of (7.3.1) and (7.3.7) is found to be

$u = \alpha x, \ 0 \le x < x^*, \qquad u = \varepsilon\alpha x + 1 - \varepsilon\alpha, \ x^* \le x \le 1, \qquad \alpha = 1/(x^* - \varepsilon x^* + \varepsilon)$  (7.3.8)

Assume $x_k < x^* \le x_{k+1}$. By postulating a piecewise linear solution

$u_j = \alpha j, \ 0 \le j \le k, \qquad u_j = \beta j - \beta n + 1, \ k+1 \le j \le n$  (7.3.9)

one finds that the solution of (7.3.5), with the boundary conditions given above, is given by
(7.3.9) with

$\beta = \varepsilon\alpha, \qquad \alpha = \left( \varepsilon \frac{1-\varepsilon}{1+\varepsilon} + \varepsilon(n-k) + k \right)^{-1}$  (7.3.10)

Hence

$u_k = \frac{x_k}{\varepsilon h (1-\varepsilon)/(1+\varepsilon) + (1-\varepsilon)x_k + \varepsilon}$  (7.3.11)

Let $x^* = x_{k+1}$. The exact solution in $x_k$ is

$u(x_k) = \frac{x_k}{(1-\varepsilon)x_{k+1} + \varepsilon}$  (7.3.12)

Hence, the error satisfies

$u_k - u(x_k) = O\left( \frac{1-\varepsilon}{1+\varepsilon} h \right)$  (7.3.13)

As another example, let $x^* = x_k + h/2$. The numerical solution in $x_k$ is still given by (7.3.11).
The exact solution in $x_k$ is

$u(x_k) = \frac{x_k}{(1-\varepsilon)x_k + \varepsilon + h(1-\varepsilon)/2}$  (7.3.14)

The error in $x_k$ satisfies

$u_k - u(x_k) = O\left( \frac{(1-\varepsilon)^2}{1+\varepsilon} h \right)$  (7.3.15)

When a(x) is continuous ($\varepsilon = 1$) the error is zero. For general continuous a(x) the error is
$O(h^2)$. When a(x) is discontinuous, the error of (7.3.4) increases to O(h).
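
The first-order behaviour can be checked numerically from the closed-form expressions. The sketch below evaluates the numerical value (7.3.11) and the exact solution (7.3.8) at $x_k$ for an interface fixed at $x^* = x_{k+1} = 1/2$; the function name `fd_interface_error` and the choice of interface position are assumptions made for this illustration only.

```python
def fd_interface_error(n, eps):
    """Error u_k - u(x_k) of scheme (7.3.5) for the interface at x* = x_{k+1} = 1/2."""
    if n < 4 or n % 2 != 0:
        raise ValueError("n must be even and at least 4")
    h = 1.0 / n
    k = n // 2 - 1
    x_k = k * h
    x_star = (k + 1) * h
    # numerical solution in x_k, formula (7.3.11)
    u_num = x_k / (eps * h * (1.0 - eps) / (1.0 + eps) + (1.0 - eps) * x_k + eps)
    # exact solution (7.3.8) evaluated in x_k
    u_exact = x_k / (x_star - eps * x_star + eps)
    return u_num - u_exact


if __name__ == "__main__":
    for n in (16, 32, 64, 128):
        print(n, fd_interface_error(n, 0.1))
```

Halving h roughly halves the error for $\varepsilon \neq 1$, in line with the O(h) estimate, while for $\varepsilon = 1$ the error vanishes.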

Finite  volume  discretization 

By starting from the weak formulation (7.2.6) and using finite volume discretization one may
obtain $O(h^2)$ accuracy for discontinuous a(x). The domain is (almost) covered by cells or
finite volumes $\Omega_j$,

$\Omega_j = (x_j - h/2, \ x_j + h/2), \quad j = 1, 2, ..., n-1$  (7.3.16)

Let v(x) be the characteristic function of $\Omega_j$:

$v(x) = 0, \ x \notin \Omega_j; \qquad v(x) = 1, \ x \in \Omega_j$  (7.3.17)

A convenient unified treatment of both cases, a(x) continuous and a(x) discontinuous, is as
follows. We approximate a(x) by a piecewise constant function that has a constant value $a_j$
in each $\Omega_j$. Of course, this works best if discontinuities of a(x) lie at boundaries of finite
volumes $\Omega_j$. One may take $a_j = a(x_j)$, or

$a_j = h^{-1} \int_{\Omega_j} a \, d\Omega$

With this approximation of a(x) and v according to (7.3.17) one obtains from (7.2.7)

$a(u, v) = -\int_{\Omega_j} (a u_{,1})_{,1} \, d\Omega = -a u_{,1} \Big|_{x_j - h/2}^{x_j + h/2}$  (7.3.18)

By taking successively j = 1, 2, ..., n-1, Equation (7.2.6) leads to n-1 equations for the n-1
unknowns $u_j$ ($u_0 = 0$ and $u_n = 1$ are given), after making further approximations in (7.3.18).

In order to approximate $a u_{,1}(x_j + h/2)$ we proceed as follows. Because $a u_{,1}$ is smooth,
$u_{,1}(x_j + h/2)$ does not exist if a(x) jumps at $x = x_j + h/2$. Hence, it is a bad idea to discretize
$u_{,1}(x_j + h/2)$. Instead, we write

$u_{j+1} - u_j = \int_{x_j}^{x_{j+1}} u_{,1} \, dx = \int_{x_j}^{x_{j+1}} \frac{1}{a} a u_{,1} \, dx \cong (a u_{,1})_{j+1/2} \int_{x_j}^{x_{j+1}} \frac{1}{a} \, dx$  (7.3.19)

where we have exploited the smoothness of $a u_{,1}$. We have

$\int_{x_j}^{x_{j+1}} \frac{1}{a} \, dx = h / w_j$  (7.3.20)

with $w_j$ the harmonic average of $a_j$ and $a_{j+1}$:

$w_j \equiv 2 a_j a_{j+1} / (a_j + a_{j+1})$  (7.3.21)

and we obtain the following approximation:

$(a u_{,1})_{j+1/2} \cong w_j (u_{j+1} - u_j)/h$  (7.3.22)

With equations (7.3.18) and (7.3.22), the weak formulation (7.2.6) leads to the following
discretization:

$w_{j-1}(u_j - u_{j-1})/h - w_j(u_{j+1} - u_j)/h = h s_j, \quad j = 1, 2, ..., n-1$  (7.3.23)

with $s_j \equiv h^{-1} \int_{\Omega_j} s \, d\Omega$.

When a(x) is smooth, $w_j \approx (a_j + a_{j+1})/2$, and we recover the finite difference approximation
(7.3.5).
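
How close the harmonic and arithmetic averages are can be made explicit. The short derivation below is an added illustration; it assumes that a(x) is differentiable, so that $a_{j+1} - a_j = O(h)$, and bounded away from zero, as uniform ellipticity requires.

```latex
\frac{a_j + a_{j+1}}{2} - w_j
  = \frac{(a_j + a_{j+1})^2 - 4 a_j a_{j+1}}{2 (a_j + a_{j+1})}
  = \frac{(a_{j+1} - a_j)^2}{2 (a_j + a_{j+1})}
  = O(h^2)
```

For a jump from $\varepsilon$ to 1, by contrast, the two averages are $2\varepsilon/(1+\varepsilon)$ and $(1+\varepsilon)/2$, and their difference $(1-\varepsilon)^2 / (2(1+\varepsilon))$ does not decrease with h.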

Equation (7.3.23) can be solved in a similar way as (7.3.5) for the interface problem under
consideration. Assume $x^* = x_k + h/2$. Hence

$w_j = \varepsilon, \ 1 \le j < k; \qquad w_k = 2\varepsilon/(1+\varepsilon); \qquad w_j = 1, \ k < j \le n-1$  (7.3.24)

Again postulating a solution as in (7.3.9) one finds

$\beta = \alpha\varepsilon, \qquad \alpha = w/[\varepsilon - w\varepsilon(k+1-n) + wk]$  (7.3.25)

with $w = w_k$, or

$\alpha = [(1-\varepsilon)/2 + \varepsilon(n-k) + k]^{-1} = h/[(x_k + h/2)(1-\varepsilon) + \varepsilon]$  (7.3.26)

Comparison with (7.3.8) shows that $u_j = u(x_j)$: the numerical error is zero. In more general
circumstances the error will be $O(h^2)$. Hence, finite volume discretization is more accurate
than finite difference discretization for interface problems.
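
The comparison between the two discretizations can be reproduced with a few lines of Python. The sketch below is restricted to s = 0, in which case (7.3.23) states that the quantity $w_j(u_{j+1} - u_j)$ is the same for every j, so the system can be solved by summing increments. The names `face_coefficients` and `two_point_solve` are hypothetical; the "arithmetic" option uses the weights $(a_j + a_{j+1})/2$ of (7.3.5) for comparison.

```python
def face_coefficients(a_nodes, mean):
    """Coefficient w_j between grid points j and j+1 for the chosen averaging."""
    pairs = list(zip(a_nodes, a_nodes[1:]))
    if mean == "harmonic":  # harmonic average (7.3.21)
        return [2.0 * p * q / (p + q) for p, q in pairs]
    if mean == "arithmetic":  # weights of the finite difference scheme (7.3.5)
        return [(p + q) / 2.0 for p, q in pairs]
    raise ValueError("mean must be 'harmonic' or 'arithmetic'")


def two_point_solve(a_nodes, mean, u_left=0.0, u_right=1.0):
    """Solve w_{j-1}(u_j - u_{j-1}) = w_j(u_{j+1} - u_j), j = 1..n-1 (case s = 0)."""
    w = face_coefficients(a_nodes, mean)
    resistances = [1.0 / wj for wj in w]
    # common value of w_j (u_{j+1} - u_j), fixed by the two boundary values
    flux = (u_right - u_left) / sum(resistances)
    u = [u_left]
    for r in resistances:
        u.append(u[-1] + flux * r)
    return u


if __name__ == "__main__":
    n, k, eps = 10, 4, 0.01  # interface at x* = x_k + h/2 = 0.45
    a_nodes = [eps if j <= k else 1.0 for j in range(n + 1)]
    print(two_point_solve(a_nodes, "harmonic")[k])
    print(two_point_solve(a_nodes, "arithmetic")[k])
```

With the interface halfway between two grid points, the harmonic variant agrees with (7.3.8) at the grid points up to rounding, whereas the arithmetic variant shows a clearly visible error for small $\varepsilon$.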

Exercise 3.3.1 The discrete maximum and $l_2$ norms are defined by, respectively,

(7.3.27) 

Estimate  the  error  in  the  numerical  solution  given  by  (7.3.9)  in  these  norms. 

7.4 Cell-centered discretization in two dimensions

Cell-centered grid

The domain $\Omega$ is divided in cells as before, but now the grid points are the centers of the
cells, see Figure 7.4.1. The computational grid G is defined by

$G = \{x \in \Omega : x = x_j = (j - s)h, \ j = (j_1, j_2), \ s = (\tfrac{1}{2}, \tfrac{1}{2}), \ h = (h_1, h_2), \ j_\alpha = 1, 2, ..., n_\alpha, \ h_\alpha = 1/n_\alpha\}$  (7.4.1)

The cell with centre $x_j$ is called $\Omega_j$. Note that in a cell-centered grid there are no grid points
on the boundary $\partial\Omega$.
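
A minimal sketch of the grid (7.4.1) on the unit square is given below. It assumes the offset s = (1/2, 1/2), so that the grid points are the cell centres; the function name `cell_centres` is hypothetical.

```python
def cell_centres(n1, n2):
    """Map each index pair (j1, j2), j_alpha = 1..n_alpha, to the centre of cell Omega_j."""
    h1, h2 = 1.0 / n1, 1.0 / n2
    return {
        (j1, j2): ((j1 - 0.5) * h1, (j2 - 0.5) * h2)
        for j1 in range(1, n1 + 1)
        for j2 in range(1, n2 + 1)
    }


if __name__ == "__main__":
    for index, centre in sorted(cell_centres(4, 2).items()):
        print(index, centre)
```

All coordinates lie strictly between 0 and 1, which reflects the remark that a cell-centered grid has no points on the boundary.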

Finite  volume  discretization  in  two  dimensions 

Integration of (7.2.1) over a finite volume $\Omega_j$ gives, with c = s = 0 for brevity,

$-\int_{\Gamma_j} a_{\alpha\beta} u_{,\alpha} n_\beta \, d\Gamma + \int_{\Gamma_j} b_\alpha u n_\alpha \, d\Gamma = 0$  (7.4.2)

with $\Gamma_j$ the boundary of $\Omega_j$ and n the outward unit normal. Let us denote the "east" part
of $\Gamma_j$, at $x_1 = j_1 h_1$, by $\Gamma_e$. On $\Gamma_e$, n = (1, 0).

The convection term

We write

$\int_{\Gamma_e} b_1 u \, d\Gamma \cong h_2 (b_1 u)_{j_1+1/2, j_2}$  (7.4.3)

Figure 7.4.1: Cell-centered grid. (• grid points; lines: finite volume boundaries.)


Central discretization gives

$(b_1 u)_{j_1+1/2, j_2} \cong \frac{1}{2}\{(b_1 u)_j + (b_1 u)_{j+e_1}\}$  (7.4.4)

Upwind discretization gives

$(b_1 u)_{j_1+1/2, j_2} \cong \frac{1}{2}\{(b_1 + |b_1|)u\}_j + \frac{1}{2}\{(b_1 - |b_1|)u\}_{j+e_1}$  (7.4.5)

with $e_1 = (1, 0)$. If $a_{12} = 0$ then (7.4.4) results in a K-matrix (definition 3.2.5) only if, with w defined below,

$|b_1|_{j_1+1, j_2} h_1 / w_{j_1+1/2, j_2} < 2, \qquad |b_1|_{j_1-1, j_2} h_1 / w_{j_1-1/2, j_2} < 2$  (7.4.6)

whereas, if $a_{12} = 0$, (7.4.5) always results in a K-matrix. The advantages of having a K-matrix are:

• Monotonicity: absence of numerical wiggles.

• Good behaviour of iterative solution methods, including multigrid, as discussed in Chapter 4.
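
The effect of the mesh Peclet number on the sign pattern can be illustrated in one dimension with constant coefficients. The sketch below forms the three-point stencil obtained from a diffusion flux $-\varepsilon(u_{j+1} - u_j)/h$ plus a central or upwind convection flux with constant velocity b, and tests only the sign conditions (positive diagonal, non-positive off-diagonal elements), which are one ingredient of the K-matrix property. The names `stencil` and `has_k_matrix_signs` and the one-dimensional constant-coefficient setting are assumptions made for this example.

```python
def stencil(eps, b, h, scheme):
    """Return (west, centre, east) coefficients for -eps*u'' + b*u' in flux form."""
    if scheme == "central":
        west = -eps / h - 0.5 * b
        east = -eps / h + 0.5 * b
        centre = 2.0 * eps / h
    elif scheme == "upwind":
        west = -eps / h - 0.5 * (b + abs(b))
        east = -eps / h + 0.5 * (b - abs(b))
        centre = 2.0 * eps / h + abs(b)
    else:
        raise ValueError("scheme must be 'central' or 'upwind'")
    return west, centre, east


def has_k_matrix_signs(coefficients):
    """Check positive diagonal and non-positive off-diagonal coefficients."""
    west, centre, east = coefficients
    return centre > 0.0 and west <= 0.0 and east <= 0.0


if __name__ == "__main__":
    for eps in (1.0, 0.01):
        peclet = 1.0 * 0.1 / eps  # mesh Peclet number |b| h / eps
        for scheme in ("central", "upwind"):
            print(peclet, scheme, has_k_matrix_signs(stencil(eps, 1.0, 0.1, scheme)))
```

For mesh Peclet number below 2 both stencils have the required signs; above 2 the central stencil acquires a positive off-diagonal coefficient, while the upwind stencil does not.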


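The diffusion term

We write

$\int_{\Gamma_e} a_{\alpha 1} u_{,\alpha} \, d\Gamma \cong h_2 (a_{\alpha 1} u_{,\alpha})_{j_1+1/2, j_2}$  (7.4.7)

and approximate the flux $F_{j_1+1/2, j_2} \equiv (a_{\alpha 1} u_{,\alpha})_{j_1+1/2, j_2}$ in a similar way as in the one-dimensional
case, taking into account the fact that $(u_{,1})_{j_1+1/2, j_2}$ does not exist, but F and $u_{,2}$
are smooth. We have

$u\Big|_{j_1, j_2}^{j_1+1, j_2} = \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} u_{,1} \, dx_1 = \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} \frac{1}{a_{11}} (a_{11} u_{,1}) \, dx_1 = \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} \frac{1}{a_{11}} \{F - a_{12} u_{,2}\} \, dx_1$
$\cong F_{j_1+1/2, j_2} \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} \frac{1}{a_{11}} \, dx_1 - (u_{,2})_{j_1+1/2, j_2} \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} \frac{a_{12}}{a_{11}} \, dx_1$  (7.4.8)

We now assume that $a_{12} = 0$, or else that $u_{,2}$ may be approximated by

$(u_{,2})_{j_1+1/2, j_2} \cong \frac{1}{4h_2} \left\{ u\Big|_{j_1, j_2-1}^{j_1, j_2+1} + u\Big|_{j_1+1, j_2-1}^{j_1+1, j_2+1} \right\}$  (7.4.9)

This will be accurate only if $a_{\alpha\beta}$ is smooth at the north and south edges; the general case
seems not to have been investigated.

We obtain, with

$w_{j_1+1/2, j_2} \equiv h_1 \left\{ \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} \frac{1}{a_{11}} \, dx_1 \right\}^{-1}$  (7.4.10)

the approximation

$F_{j_1+1/2, j_2} \cong \frac{w_{j_1+1/2, j_2}}{h_1} \left\{ u\Big|_{j_1, j_2}^{j_1+1, j_2} + (u_{,2})_{j_1+1/2, j_2} \int_{x_{j_1, j_2}}^{x_{j_1+1, j_2}} \frac{a_{12}}{a_{11}} \, dx_1 \right\}$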

With (7.4.9) the K-matrix property is lost. The off-diagonal elements with the wrong sign
are much smaller than those generated by central discretization of the convection term at
high Peclet number, and usually the results obtained and the performance of iterative methods are
still satisfactory. See [141] for more details, including a discretization that gives a K-matrix
for $a_{12} \ne 0$.


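7.5 A hyperbolic system

Hyperbolic system of conservation laws

In this section we consider the following hyperbolic system of conservation laws:

$\frac{\partial u}{\partial t} + \frac{\partial f(u)}{\partial x} + \frac{\partial g(u)}{\partial y} = s, \quad (x, y) \in \Omega, \quad t \in (0, T]$  (7.5.1)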

where

$u : [0, T] \times \Omega \to S_\alpha \subset \mathbb{R}^p, \qquad s : [0, T] \times \Omega \to \mathbb{R}^p, \qquad f, g : S_\alpha \to \mathbb{R}^p$  (7.5.2)

Here $S_\alpha$ is the set of admissible states. For example, if one of the p unknowns, $u_i$ say, is
the fluid density or the speed of sound in a fluid mechanics application, then $u_i < 0$ is not
admissible.  Equation  (7.5.1)  is  a  system  of  p  equations  with  p  unknowns.  Here  we  abandon 
Cartesian  tensor  notation  for  the  more  convenient  notation  above.  Equation  (7.5.1)  is  assumed 
to  be  hyperbolic. 

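Definition 7.5.1 Equation (7.5.1) is called hyperbolic with respect to t if there exist for all
$\varphi \in [0, 2\pi)$ and admissible u a real diagonal matrix $D(u, \varphi)$ and non-singular matrix $R(u, \varphi)$
such that

$A(u, \varphi) R(u, \varphi) = R(u, \varphi) D(u, \varphi)$  (7.5.3)

where

$A(u, \varphi) = \cos\varphi \, \frac{\partial f(u)}{\partial u} + \sin\varphi \, \frac{\partial g(u)}{\partial u}$  (7.5.4)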

The  main  example  to  date  of  systems  of  type  (7.5.1)  to  which  multigrid  methods  have  been 
applied  successfully  are  the  Euler  equations  of  gas  dynamics.  See  [34]  for  more  details  on 
the  mathematical  properties  of  these  equations  and  of  hyperbolic  systems  in  general.  For 
numerical  aspects  of  hyperbolic  systems,  see  [101]  or  [104]. 

For the discretization of (7.5.1), schemes of Lax-Wendroff type (see [101]) have long been popular
and still are widely used. These schemes are explicit and, for time-dependent problems,
there is no need for multigrid: stability and accuracy restrictions on the time step $\Delta t$ are
about equally severe. If the time-dependent formulation is used solely as a means to compute
a steady state, then one would like to be unrestricted in the choice of $\Delta t$ and/or use artificial
means to get rid of the transients quickly.

In  [92]  a  method  has  been  proposed  to  do  this  using  multiple  grids.  This  method  has 
been  developed  further  in  [76],  [30]  and  [77].  The  method  is  restricted  to  Lax-Wendroff  type 
formulations. 


Finite  volume  discretization 

Following the main trend in contemporary computational fluid dynamics, we discuss only the
cell-centered case. The grid is given in Figure 7.4.1. Integration of (7.5.1) over $\Omega_j$ gives, using
the Gauss divergence theorem,

$\frac{d}{dt} \int_{\Omega_j} u \, d\Omega + \int_{\Gamma_j} (f(u) n_x + g(u) n_y) \, d\Gamma = \int_{\Omega_j} s \, d\Omega$  (7.5.5)

where $\Gamma_j$ is the boundary of $\Omega_j$. With the approximations

$\int_{\Omega_j} u \, d\Omega \cong |\Omega_j| u_j, \qquad \int_{\Omega_j} s \, d\Omega \cong |\Omega_j| s_j$  (7.5.6)

where $|\Omega_j|$ is the area of $\Omega_j$, Equation (7.5.5) becomes

$|\Omega_j| \, du_j/dt + \int_{\Gamma_j} (f(u) n_x + g(u) n_y) \, d\Gamma = |\Omega_j| s_j$  (7.5.7)

The time discretization will not be discussed. The space discretization takes place by approximating
the integral over $\Gamma_j$.

Let $A = x_j + (h_1/2, -h_2/2)$, $B = x_j + (h_1/2, h_2/2)$, so that AB is part of $\Gamma_j$. On AB, $n_x = 1$
and $n_y = 0$. We write

$\int_A^B f(u) \, dx_2 \cong h_2 f(u)_C$  (7.5.8)

with C the midpoint of AB. Central space discretization is obtained with

$f(u)_C = \frac{1}{2} f(u_j) + \frac{1}{2} f(u_{j+e_1}), \quad e_1 = (1, 0)$  (7.5.9)


In the presence of shocks, this does not lead to the correct weak solution, unless thermodynamic
irreversibility is enforced. This may be done by introducing artificial viscosity, an
approach  followed  in  [72].  Another  approach  is  to  use  upwind  space  discretization,  obtained 
by  flux  splitting: 

$f(u) = f^+(u) + f^-(u)$  (7.5.10)

with $f^\pm(u)$ chosen such that the eigenvalues of the Jacobians of $f^\pm(u)$ satisfy

$\lambda(\partial f^+/\partial u) \ge 0, \qquad \lambda(\partial f^-/\partial u) \le 0$  (7.5.11)

There are many splittings satisfying (7.5.11). For a survey of flux splitting, see [64] and [126].
With upwind discretization, $f(u)_C$ is approximated by

$f(u)_C \cong f^+(u_j) + f^-(u_{j+e_1})$  (7.5.12)
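
For a scalar linear flux f(u) = bu the idea can be sketched in a few lines of Python. The splitting used below, $f^\pm(u) = \frac{1}{2}(b \pm |b|)u$, is one simple choice among the many possible ones; the function names `split_flux` and `upwind_face_flux` are hypothetical, and the scalar linear setting is an assumption made to keep the example short.

```python
def split_flux(b, u):
    """Split f(u) = b*u into f_plus + f_minus with df_plus/du >= 0 and df_minus/du <= 0."""
    f_plus = 0.5 * (b + abs(b)) * u
    f_minus = 0.5 * (b - abs(b)) * u
    return f_plus, f_minus


def upwind_face_flux(b, u_left, u_right):
    """Face flux in the manner of (7.5.12): f_plus from the left cell, f_minus from the right cell."""
    f_plus, _ = split_flux(b, u_left)
    _, f_minus = split_flux(b, u_right)
    return f_plus + f_minus


if __name__ == "__main__":
    print(upwind_face_flux(2.0, 3.0, 5.0))   # b > 0: flux taken from the left state
    print(upwind_face_flux(-2.0, 3.0, 5.0))  # b < 0: flux taken from the right state
```

For b > 0 the face flux depends only on the left state and for b < 0 only on the right state, which is the upwind character of the discretization.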


The  implementation  of  boundary  conditions  for  hyperbolic  systems  is  not  simple,  and  will 
not  be  discussed  here;  the  reader  is  referred  to  the  literature  mentioned  above. 

Exercise  7.5.1  Show  that  the  flux  splitting  (7.4.5)  satisfies  (7.5.11). 

8 Conclusion


An  introduction  has  been  presented  to  the  application  of  multigrid  methods  to  the  numerical 
solution  of  elliptic  and  hyperbolic  partial  differential  equations. 

Because robustness is strongly influenced by the smoothing method used, much attention has
been  given  to  smoothing  analysis,  and  many  possible  smoothing  methods  have  been  presented. 

An  attempt  has  been  made  to  review  much  of  the  literature,  to  help  the  reader  to  find  his 
way  quickly  to  material  relevant  to  his  interests.  For  more  information,  see  [141]. 

In this book application of multigrid to the equations of fluid dynamics is reviewed, a topic
not  covered  here.  There  the  full  potential  equation,  the  Euler  equations,  the  compressible 
Navier-Stokes  equations  and  the  incompressible  Navier-Stokes  and  Boussinesq  equations  are 
treated. 

The  principles  discussed  in  these  notes  hold  quite  generally,  making  solution  possible  at  a  cost 
of  a  few  work  units,  as  discussed  in  chapter  6,  for  problems  more  difficult  than  considered 
here. 